
OPERATIONAL DEFECT DATABASE
Memory leak at kernel level with pubd process triggering unexpected reloads on C9800 wireless controllers. Memory leak observed on 17.09.02, 17.09.03 IOS-XE:

show platform software status control brief

  Load Average
   Slot   Status   1-Min  5-Min  15-Min
   1-RP0  Healthy  0.82   0.68   0.65
   2-RP0  Healthy  0.20   0.46   0.38

  Memory (kB)
   Slot   Status    Total     Used (Pct)      Free (Pct)      Committed (Pct)
   1-RP0  Critical  32356512  31981540 (99%)  374972 ( 1%)    34712256 (107%)
   2-RP0  Healthy   32356512  5099088 (16%)   27257424 (84%)  7525456 (23%)

show platform software process memory chassis active r0 all sorted

   Pid    RSS       PSS       Heap      Shared  Private   Name
  --------------------------------------------------------------------------
   24871  26252896  25930318  25803252  395988  25856908  pubd   <<<<<<< RSS value is always increasing.
   4499   1256916   1123343   464       177872  1079044   linux_iosd-imag
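The check flagged above (RSS for pubd always increasing) can be sketched offline against saved captures of the process memory command. This is a minimal sketch, not a Cisco tool: the function names are hypothetical, the first capture reuses the rows shown above, and the second capture is made-up illustration data.

```python
import re

# One data row: Pid RSS PSS Heap Shared Private Name (header and rule lines do not match)
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)")

def parse_process_memory(text):
    """Return {process_name: rss} from one capture of the command output.

    If several processes share a name, the last row wins (a simplification).
    """
    rss = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rss[match.group(7)] = int(match.group(2))
    return rss

def steadily_growing(captures, name):
    """True if RSS for `name` strictly increases across every capture."""
    values = [parse_process_memory(capture)[name] for capture in captures]
    return len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))

# Rows taken from the output above.
CAPTURE_1 = """
 24871 26252896 25930318 25803252 395988 25856908 pubd
 4499 1256916 1123343 464 177872 1079044 linux_iosd-imag
"""

# Hypothetical later capture, invented for illustration only.
CAPTURE_2 = """
 24871 26300000 25977000 25850000 395988 25904000 pubd
 4499 1256916 1123343 464 177872 1079044 linux_iosd-imag
"""

if __name__ == "__main__":
    print(steadily_growing([CAPTURE_1, CAPTURE_2], "pubd"))
```

Strict growth across only two captures is weak evidence; more captures spaced over hours give a more reliable signal.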
Memory usage for the pubd process increases continuously because of telemetry connection flaps. An unexpected reload is observed once memory usage reaches 99% due to the memory leak condition.
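To plan a maintenance window before the 99% mark is reached, the remaining time can be roughly projected from two readings of the Used column. The sketch below assumes the leak grows linearly, which the bug report does not state; the function names and parameters are hypothetical.

```python
def used_percent(total_kb, used_kb):
    """Memory usage as a percentage of total."""
    return used_kb * 100 / total_kb

def hours_until_threshold(total_kb, used_then_kb, used_now_kb,
                          hours_between, threshold_pct=99):
    """Project hours until usage reaches threshold_pct, assuming linear growth.

    Returns None when usage is not growing, and 0.0 when the threshold
    has already been reached.
    """
    rate = (used_now_kb - used_then_kb) / hours_between  # kB per hour
    if rate <= 0:
        return None
    remaining = total_kb * threshold_pct / 100 - used_now_kb
    return max(remaining / rate, 0.0)

if __name__ == "__main__":
    # Total and Used values for 1-RP0 from the output in this report.
    print(round(used_percent(32356512, 31981540)))
```

The projection is only as good as the linearity assumption; if the connection flap rate changes, so will the leak rate.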
- Reload or switch over the active controller in a planned maintenance window to recover memory.
Or
- Remove the gRPC subscription. This stops the leak but does not recover memory.
Or
- Unconfigure and reconfigure netconf-yang to restart the pubd process.
Or
- Ensure that the telemetry connection is stable and not flapping.
The leak is related to telemetry configuration. Sets of gRPC subscriptions can be observed in the tech reports.

+ Commands to monitor memory and track whether the pubd process is leaking memory:

  show platform software status control brief
  show platform resources
  show process memory platform sorted
  show process memory platform accounting
  show platform software process memory chassis active r0 all sorted
  show platform software process memory chassis standby r0 all sorted

+ If the RSS memory usage of pubd is found to be constantly increasing, clear the counters:

  debug platform software memory mdt-pubd chassis active r0 alloc callsite stop
  debug platform software memory mdt-pubd chassis active r0 alloc callsite clear
  debug platform software memory mdt-pubd chassis active r0 alloc callsite start

+ Collect many samples with the following command and determine which call site has the most significantly increasing diff_call:

  show platform software memory mdt-pubd chassis active r0 alloc callsite brief

   callsite          thread  diff_byte  diff_call
  ----------------------------------------------------------------
   0AFFF6C300FCC006  27519   781824     509
   9423E051484E0000  27519   524992     26
   0AFFF6C300FCC002  27519   524809     15279   <<<<<<<<<<<<<<<<<
   108CA0D9C0F0007D  27519   371488     453

+ Use the callsite id with the following command:

  debug platform software memory mdt-pubd chassis active r0 alloc backtrace start 0AFFF6C300FCC002 depth 10

+ And share the output of the command below with Cisco TAC:

  show platform software memory mdt-pubd chassis active r0 alloc backtrace

If it is determined that a 9800-40 or 9800-80 system is affected by this bug, an SMU is available to resolve it:

  https://software.cisco.com/download/home/286316412/type/286308587/release/17.9.3
  C9800-universalk9_wlc.17.09.03.CSCwf60151.SPA.smu.bin
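The step of comparing callsite samples to find the fastest-growing diff_call can be sketched as a small offline script run against saved command output. This is an illustrative sketch with hypothetical function names: the first sample reuses the rows shown in this report, and the second sample is invented data.

```python
import re

# One data row: callsite (hex id), thread, diff_byte, diff_call
ROW = re.compile(r"^\s*([0-9A-Fa-f]+)\s+(\d+)\s+(\d+)\s+(\d+)")

def parse_callsites(text):
    """Return {callsite_id: (diff_byte, diff_call)} from one sample."""
    sites = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            sites[match.group(1)] = (int(match.group(3)), int(match.group(4)))
    return sites

def fastest_growing(first, second):
    """Return the callsite id whose diff_call grew most between two samples."""
    before = parse_callsites(first)
    after = parse_callsites(second)
    growth = {
        site: calls - before.get(site, (0, 0))[1]
        for site, (_, calls) in after.items()
    }
    return max(growth, key=growth.get)

# Rows taken from the callsite output in this report.
SAMPLE_1 = """
0AFFF6C300FCC006 27519 781824 509
9423E051484E0000 27519 524992 26
0AFFF6C300FCC002 27519 524809 15279
108CA0D9C0F0007D 27519 371488 453
"""

# Hypothetical later sample, invented for illustration only.
SAMPLE_2 = """
0AFFF6C300FCC006 27519 782000 512
9423E051484E0000 27519 524992 26
0AFFF6C300FCC002 27519 1049000 30540
108CA0D9C0F0007D 27519 371600 455
"""

if __name__ == "__main__":
    print(fastest_growing(SAMPLE_1, SAMPLE_2))
```

The returned id is the value to pass to the backtrace start command; comparing more than two samples reduces the chance of picking a callsite that spiked only once.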