Symptom
The combination of cbQoS cache and telemetry collection from the QoS PD component may block LC QoS stats processing:
#show processes blocked location 0/X/cpu0
Fri Apr 26 12:04:33.347 CEST
Jid  Pid   Tid   ProcessName  State  TimeInState      Blocked-on
203  7465  7465  qos_ma_ea    Reply  0509:42:03.0269  6502 qos_ma
207  6502  6502  qos_ma       Reply  0509:42:03.0275  5813 ifmgr
Restarting qos_ma on the LC fixes the issue for some time, until the next blocking state is observed.
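The blocking chain in the output above (qos_ma_ea waiting on qos_ma, which in turn waits on ifmgr) can be spotted automatically from captured CLI text. The following is a minimal Python sketch, not a Cisco tool. It assumes each data row has the column layout shown above, with Blocked-on given as a PID followed by a process name. The function names parse_blocked and blocking_chain are hypothetical.

```python
def parse_blocked(output):
    """Map each blocked process name to the process name it is blocked on."""
    blocked = {}
    for line in output.splitlines():
        fields = line.split()
        # Assumed row layout:
        # Jid Pid Tid ProcessName State TimeInState BlockedPid BlockedName
        # The header and timestamp lines are skipped (first field not numeric).
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        blocked[fields[3]] = fields[7]
    return blocked


def blocking_chain(blocked, start):
    """Follow Blocked-on links from 'start' until a process that is not blocked."""
    chain = [start]
    # Stop at the end of the chain, or if a cycle would be formed.
    while chain[-1] in blocked and blocked[chain[-1]] not in chain:
        chain.append(blocked[chain[-1]])
    return chain


# Sample taken from the output in this note.
SAMPLE = """\
Fri Apr 26 12:04:33.347 CEST
Jid Pid Tid ProcessName State TimeInState Blocked-on
203 7465 7465 qos_ma_ea Reply 0509:42:03.0269 6502 qos_ma
207 6502 6502 qos_ma Reply 0509:42:03.0275 5813 ifmgr
"""

chain = blocking_chain(parse_blocked(SAMPLE), "qos_ma_ea")
print(" -> ".join(chain))  # qos_ma_ea -> qos_ma -> ifmgr
```

A chain that starts at qos_ma_ea and ends at ifmgr matches the symptom described here. Any other chain points to a different blocking condition.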
Conditions
[1] Observed on eXR/cXR running 6.5.2/6.5.3
[2] cbQoS cache enabled with 'snmp-server mibs cbqosmib cache'
[3] Telemetry requests PD QoS stats (Cisco-IOS-XR-asr9k-qos-oper:platform-qos)
Conditions suspected of triggering the issue:
- a 10x10 breakout interface with a QoS policy applied
- a large number (>10) of interfaces with a QoS policy applied
Workaround
In some scenarios, removing the QoS policy from the 10x10 breakout interface and restarting the qos_ma process on the relevant LC stops the issue from reappearing.
Further Problem Description