...
Attempt to Add new RP cluster to a working system.
Connect cluster fails with error "Connect clusters failed. Could not reach the remote vRPA cluster (IP xxx.xxx.xxx.xxx)."
This error will appear on WDM / Deployment Manager and in the dm.log:

2019-05-10 15:05:21,354 [CommandWorker-3] (Command.java:88) ERROR - Command#run() ServerException errorType[CONNECT_CLUSTER] errorMessage[Connect clusters failed. Could not reach the remote vRPA cluster (IP xxx.xxx.xxx.xxx).Verify the remote vRPAs are up and reachable from current cluster and run the wizard again.] failure UID: 412886e2-35a9-453d-95fd-8e72d9f205dd
com.kashya.installation.server.exceptions.ConnectClusterException: Connect clusters failed. Could not reach the remote vRPA cluster (IP xxx.xxx.xxx.xxx).Verify the remote vRPAs are up and reachable from current cluster and run the wizard again.
    at com.kashya.installation.server.commands.global.AddSiteCommand.analyzeAndThrowRelevantConnectClusterExceptions(AddSiteCommand.java:890) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.addDistributeClusterCertificatesWorker(AddSiteCommand.java:574) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.distributeCertificateInCluster(AddSiteCommand.java:535) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.exchangeAndDistributeClusterCertificates(AddSiteCommand.java:517) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.getNewSiteConfiguration(AddSiteCommand.java:327) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.execute(AddSiteCommand.java:165) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.execute(AddSiteCommand.java:28) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.Command.runNormal(Command.java:109) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.Command.run(Command.java:48) [com.kashya.recoverpoint.installation.server.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at com.kashya.installation.server.ThreadPoolFactory$1.run(ThreadPoolFactory.java:43) [com.kashya.recoverpoint.installation.server.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2019-05-10 15:05:21,354 [CommandWorker-3] (Command.java:55) INFO - Command[AddSiteCommand] Command Ended [transactionID=51]
2019-05-10 15:05:21,460 [http-bio-8081-exec-9] (CommandsExecutor.java:181) INFO - getResult() - Remove completed transaction [transactionID=51] Command[null] result[Result(errorMessage=Connect clusters failed. Could not reach the remote vRPA cluster (IP xxx.xxx.xxx.xxx).Verify the remote vRPAs are up and reachable from current cluster and run the wizard again., errorType=CONNECT_CLUSTER, errorID=412886e2-35a9-453d-95fd-8e72d9f205dd, value=null, completedSuccessfully=false)]

More errors in the logs:

2019-05-10 15:05:21,352 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:68) INFO - DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3 progressed 0 stages - completed
2019-05-10 15:05:21,353 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:88) INFO - In updateTimeLeft
2019-05-10 15:05:21,353 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:73) INFO - Current progress is: At 0 of 0 (0%) - completed , time left: 0
2019-05-10 15:05:21,353 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:56) INFO - Removing worker DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3
2019-05-10 15:05:21,353 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:58) INFO - workers remaining: 0
2019-05-10 15:05:21,353 [CommandWorker-3] (AddSiteCommand.java:882) WARN - Failed to distribute certificate.
com.kashya.installation.server.exceptions.CommandFailedException: Operation failed. Operation failed. ClientTransportException: HTTP transport error: java.net.ConnectException: Connection timed out (Connection timed out)
    at com.kashya.installation.server.commands.global.WaitingWorkerResultListener.getResult(WaitingWorkerResultListener.java:39) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.addDistributeClusterCertificatesWorker(AddSiteCommand.java:572) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.distributeCertificateInCluster(AddSiteCommand.java:535) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.exchangeAndDistributeClusterCertificates(AddSiteCommand.java:517) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.getNewSiteConfiguration(AddSiteCommand.java:327) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.execute(AddSiteCommand.java:165) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.global.AddSiteCommand.execute(AddSiteCommand.java:28) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.Command.runNormal(Command.java:109) [com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.commands.Command.run(Command.java:48) [com.kashya.recoverpoint.installation.server.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at com.kashya.installation.server.ThreadPoolFactory$1.run(ThreadPoolFactory.java:43) [com.kashya.recoverpoint.installation.server.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: com.kashya.installation.server.exceptions.CommandFailedException: Operation failed. ClientTransportException: HTTP transport error: java.net.ConnectException: Connection timed out (Connection timed out)
    at com.kashya.installation.server.commands.internal.DistributeClusterCertificatesWorker.enterResult(DistributeClusterCertificatesWorker.java:38) ~[com.kashya.recoverpoint.installation.server.jar:?]
    at com.kashya.installation.server.infocollect.ClientWorker.run(ClientWorker.java:175) ~[com.kashya.recoverpoint.installation.server.jar:?]
Although the errors point to a specific IP (the WAN IP of RPA1), the actual issue is with a different IP.
When the system adds the new site, it needs to update all new RPAs with the cluster certificate.
The error comes up when the system cannot update the certificate on one of the RPAs. In the case above it is Kbox 3, that is, RPA 4.
Resolution:
Identify the RPA that is having issues updating its certificate, in this case:

2019-05-10 15:05:21,353 [DistributeClusterCertificatesWorker_collection_0xYYYYYYYYYYYYYYYY-3] (CommandWorkerListener.java:58) INFO - workers remaining: 0
2019-05-10 15:05:21,353 [CommandWorker-3] (AddSiteCommand.java:882) WARN - Failed to distribute certificate.

This is Kbox 3 = RPA 4 of the new cluster.
Check for connectivity issues to this specific RPA from the Site Control RPA of the original cluster, resolve them, and retry the Add cluster operation.
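
The identification step can be sketched as a short Python script that scans dm.log text for the "Failed to distribute certificate" warning and reports the worker that logged just before it. This is an illustration only, not a supported tool. The function name find_failed_rpas is hypothetical, and the mapping from the worker name suffix to the RPA number (suffix 3 = Kbox 3 = RPA 4) is an assumption inferred from the single example in this article.

```python
import re

# Assumption: the trailing "-<n>" in the worker thread name is a zero-based
# Kbox index, as in the article's example (suffix 3 = Kbox 3 = RPA 4).
WORKER_RE = re.compile(
    r"\[DistributeClusterCertificatesWorker_collection_\w+-(\d+)\]"
)
FAILURE_TEXT = "Failed to distribute certificate"


def find_failed_rpas(log_text):
    """Return RPA numbers whose certificate worker logged just before a failure."""
    failed = []
    last_index = None  # index of the most recent certificate worker seen
    for line in log_text.splitlines():
        match = WORKER_RE.search(line)
        if match:
            last_index = int(match.group(1))
        if FAILURE_TEXT in line and last_index is not None:
            rpa_number = last_index + 1  # Kbox index -> RPA number (assumed)
            if rpa_number not in failed:
                failed.append(rpa_number)
    return failed


if __name__ == "__main__":
    import sys

    # Usage: python find_failed_rpas.py /path/to/dm.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for rpa in find_failed_rpas(handle.read()):
            print(f"Certificate distribution failed near worker for RPA {rpa}")
```

The result is only a starting point: confirm the suspected RPA by testing connectivity to it from the Site Control RPA of the original cluster before retrying the operation.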