
OPERATIONAL DEFECT DATABASE
...

...
The stack_mgr process can unexpectedly fail on a Switch stack:

Chassis 1 reloading, reason - EHSA keepalive timeout
Apr 1 11:22:07.739: %PMAN-3-RPSWITCH: R0/0: RP switch initiated. Critical process stack_mgr has failed (rc 0)
Apr 1 11:22:16.452: %PMAN-3-PROCHOLDDOWN: R0/0: The process stack_mgr has been helddown (rc 139)
Apr 1 11:22:22.078: %PMAN-5-EXITACTION: F0/0: pvp: Process manager is exiting: reload fp action requested
INFO: rcu_sched detected stalls on CPUs/tasks:
3-...: (9 ticks this GP) idle=b25/140000000000000/0 softirq=57412647/57412647 fqs=5049
(detected by 2, t=5288 jiffies, g=26983533, c=26983532, q=54430)
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [flashutil:2940]
Kernel panic - not syncing: softlockup: hung tasks
CPU: 1 PID: 2940 Comm: flashutil Tainted: G W O L 4.4.155 #1
Hardware name: Cisco Craw64 on DopplerG 2.0 (DT)
Call trace:
[] dump_backtrace+0x0/0x148
[] show_stack+0x14/0x20
[] dump_stack+0x98/0xbc
[] panic+0xf8/0x25c
[] watchdog+0x0/0x48
[] __hrtimer_run_queues+0xf0/0x178
[] hrtimer_interrupt+0x98/0x1c8
[] arch_timer_handler_phys+0x30/0x40
[] handle_percpu_devid_irq+0x78/0xa0
[] generic_handle_irq+0x24/0x38
[] __handle_domain_irq+0x5c/0xb8
[] gic_handle_irq+0x58/0xb0
The crash occurs when a large number of Dot1x sessions are active across the stack and the "default interface range" command is applied to many interfaces at once.
The issue has only been seen in the 16.11.x releases; 16.12.3 and later releases are not affected.
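
As a rough aid for checking an inventory against the release information above, the following is a minimal Python sketch. The function name classify_release and the three return labels are hypothetical and not part of the report. It assumes version strings begin with three dot-separated numbers (for example 16.11.1). Releases the report does not mention, such as 16.12.1, 16.12.2 or anything before 16.11, are labelled "not stated" rather than guessed.

```python
import re


def classify_release(version: str) -> str:
    """Classify a version string against the releases named in this report.

    Hypothetical helper: returns "affected", "not affected", or "not stated".
    """
    # Assumption: the string starts with major.minor.patch, e.g. "16.11.1c".
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    # The report says the issue has only been seen in 16.11.x.
    if (major, minor) == (16, 11):
        return "affected"
    # The report says 16.12.3 and later releases do not see the issue.
    if (major, minor, patch) >= (16, 12, 3):
        return "not affected"
    # Anything else is not covered by the report.
    return "not stated"


if __name__ == "__main__":
    for release in ("16.11.1", "16.12.2", "16.12.3"):
        print(release, "->", classify_release(release))
```

The third label is deliberate: the report is silent on some releases, so the sketch avoids implying a status for them.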