Symptom
- Two of the rings acting as non-RPL owner on an ASR9010 running eXR 6.6.3 are unable to process R-APS(NR,RB) control packets sent by the RPL owner via OLTs.
- Some of the ERPS show commands and all of the ERPS debugs produce no results while the issue is present, and fail with the error shown below:
RP/0/RSP1/CPU0:MHSNRMSNAAR002#sh ethernet ring g8032 statistics ENTERPRISE-51
Error: Failed to retrieve "StatisticsTable" due to the following error:
'sysdb' detected the 'warning' condition 'An EDM took too long to process a request and was timed out'
Conditions
>> Trigger: ifmgr process restart. The spio client DLL running in erp_io receives the IM reconnection open event.
>> Output of "show spio trace all location RSP1":
Feb 26 07:04:52.838 spio/client-event 0/RSP1/CPU0 t17631 Client JID 1004 ERP [Active]: IM disconnect
Feb 26 07:04:52.838 spio/error 0/RSP1/CPU0 t17631 Client JID 1004 ERP [Active] Async error (op id 0x0003b120): Async resync message error returned asyncronously, rc: 0x409a0c00 - 'ifmgr' detected the 'warning' condition 'Failed to contact the server'
The error above suggests a deadlock in the erp_io process on the active RSP. To confirm whether the deadlock is being hit, filter the trace above for ERP (or for the JID of ERP).
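The filtering step can also be done offline on a saved copy of the trace. The following is only a rough Python sketch: the line layout is an assumption inferred from the two sample trace lines above, the function names (erp_trace_entries, shows_issue_signature) are hypothetical, and a match is a hint that the same condition is present, not proof of a deadlock.

```python
import re

# Assumed layout of one "show spio trace" line, inferred from the two samples above:
#   <Mon DD HH:MM:SS.mmm> <spio/event-type> <node> <thread> <message>
TRACE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:.]+)\s+"
    r"(?P<kind>spio/\S+)\s+"
    r"(?P<node>\S+)\s+"
    r"(?P<thread>t\d+)\s+"
    r"(?P<msg>.*)$"
)
CLIENT_RE = re.compile(r"Client JID (\d+) (\S+)")

# The two trace lines quoted in this report, used as sample input.
SAMPLE = """\
Feb 26 07:04:52.838 spio/client-event 0/RSP1/CPU0 t17631 Client JID 1004 ERP [Active]: IM disconnect
Feb 26 07:04:52.838 spio/error 0/RSP1/CPU0 t17631 Client JID 1004 ERP [Active] Async error (op id 0x0003b120): Async resync message error returned asyncronously, rc: 0x409a0c00 - 'ifmgr' detected the 'warning' condition 'Failed to contact the server'
"""


def erp_trace_entries(trace_text, jid=None):
    """Return parsed entries for the ERP client, or for the given JID if one is passed."""
    entries = []
    for line in trace_text.splitlines():
        parsed = TRACE_RE.match(line.strip())
        if not parsed:
            continue  # skip lines that do not follow the assumed layout
        client = CLIENT_RE.match(parsed.group("msg"))
        if not client:
            continue  # not a per-client message
        if jid is not None:
            if int(client.group(1)) != jid:
                continue
        elif client.group(2) != "ERP":
            continue
        entries.append(parsed.groupdict())
    return entries


def shows_issue_signature(entries):
    """True if an IM disconnect is followed by a 'Failed to contact the server' error."""
    seen_disconnect = False
    for entry in entries:
        if entry["kind"] == "spio/client-event" and "IM disconnect" in entry["msg"]:
            seen_disconnect = True
        elif (
            seen_disconnect
            and entry["kind"] == "spio/error"
            and "Failed to contact the server" in entry["msg"]
        ):
            return True
    return False


if __name__ == "__main__":
    matches = erp_trace_entries(SAMPLE)
    print(len(matches), shows_issue_signature(matches))
```

To use it on a real capture, replace SAMPLE with the saved trace text, and pass jid=... if the ERP JID on the node differs from the client name match.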
Workaround
process restart erp_io
Further Problem Description
- The ERPS rings acting as RPL owner do not face any issue with control packet communication.
- The impacted rings run across two ASR9K boxes on different software: 6.6.3 eXR and 6.6.3 cXR. The issue is observed on the node running eXR.