Symptoms
Observed in NetWorker 19.1 and later.
Data Domain devices get marked as 'suspect'.
A few minutes later they return to 'normal'.
The problem can be seen around the time nsrim runs.
Messages in logs:
Apr 8 13:05:37 NW_server root: NetWorker index: (notice) nsrim has finished cross-checking the media database
Apr 8 13:05:37 NW_server root: NetWorker media: (warning) device_01 opening: Unable to connect to 'DDR_01' ([5002] [32993] [139996958791424] Wed Apr 8 13:05:37 2020
Apr 8 13:05:37 NW_server root: #011ddp_connect_with_config() failed, Hostname: DDR_01, Err: 5002-max allowed connections exceeded).
Apr 8 13:05:47 NW_server root: NetWorker device disabled: (warning) device `device_03` is automatically marked from `Normal` device to `Suspected` device.
Apr 8 13:08:47 NW_server root: NetWorker device disabled: (warning) device `device_02` is automatically marked from `Suspected` device to `Normal` device.
Cause
Device Connectivity Check (DCC) checks DDBoost devices every 3 minutes by default.
By design, a device is marked as Suspect if its DCC check fails, and it reverts to Normal once the checks succeed again.
In this situation the DCC failure was caused by too many concurrent connections to the affected DDR, which were opened by nsrim while it attempted to mark expired files for deletion.
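To check whether a given Suspect event matches this cause, the log can be scanned for devices marked Suspected shortly after a "max allowed connections exceeded" error. The following Python sketch is one way to do that, not a NetWorker tool. It assumes syslog lines in the same format as the sample messages above; the function names, the 60-second window, and the hard-coded year (syslog timestamps carry none) are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta

# Syslog timestamp prefix as in the sample messages, e.g. "Apr 8 13:05:37".
TS = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s")
# Error text taken from the sample ddp_connect_with_config() message.
CONN_ERR = "max allowed connections exceeded"
# Matches only the Normal -> Suspected transition, not the recovery message.
SUSPECT = re.compile(
    r"device `([^`]+)` is automatically marked from `Normal` device "
    r"to `Suspected` device"
)


def parse_ts(line, year=2020):
    """Return the line's timestamp, or None if it has no syslog prefix.

    The year is an assumption supplied by the caller.
    """
    m = TS.match(line)
    if not m:
        return None
    stamp = " ".join(m.group(1).split())  # collapse padded whitespace
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def suspects_after_connection_errors(lines, window_seconds=60, year=2020):
    """List (device, suspect_time, error_time) for devices marked Suspected
    within window_seconds after the most recent connection-limit error."""
    window = timedelta(seconds=window_seconds)
    last_err = None
    hits = []
    for line in lines:
        ts = parse_ts(line, year)
        if ts is None:
            continue
        if CONN_ERR in line:
            last_err = ts
        m = SUSPECT.search(line)
        if m and last_err is not None and timedelta(0) <= ts - last_err <= window:
            hits.append((m.group(1), ts, last_err))
    return hits
```

If most Suspect events in the log appear in the result, and the errors cluster around the "nsrim has finished cross-checking the media database" notices, the behavior is consistent with the cause described here.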
Resolution
The workaround is to disable DCC, either globally in the NetWorker server properties or individually in the storage node properties.
Disabling DCC has no operational impact.