...
Some device and metric information is missing from various reports.
Data sent to the Primary Backend (Backend0) failover filter may consume large amounts of disk space.
Metrics that should be stored on the Backend are not pushed to the Backend and therefore do not appear in the reports.
WARNING and SEVERE messages from the Load Balancer connectors are observed in the logs of remote collectors.

Load Balancer connectors cannot write data to the Primary Backend, and the log /opt/APG/Collecting/Collector-Manager/Load-Balancer/logs/collecting-1-0.log shows the following errors:

    WARNING -- [2014-09-05 15:09:17 EEST] -- LoadFactorDecision$EndPointLoadFactor::syncLoadFactor(): Couldn't synchronize the load factor for end point Backend8. Reverting to current load of 85597.0 and a limit of 750000.0: The server sent HTTP status code 403: Forbidden
    SEVERE -- [2014-09-05 15:12:18 EEST] -- SocketConnector::sendBuffer(): Can't write to <FQDN/x.x.x.x>:2000
    java.io.IOException: Connection reset by peer
    WARNING -- [2014-09-05 15:12:18 EEST] -- AbstractCollector::pushNext(): Error pushing 1409857142: PipeError: Can't write to <FQDN/x.x.x.x>:2000!
    WARNING -- [2015-09-24 15:06:58 AST] -- AbstractCollector::pushNext(): Timeout while attempting to write data to Connector-Backend0
    WARNING -- [2015-09-24 15:08:03 AST] -- AbstractCollector::pushNext(): Timeout while attempting to write data to Connector-Backend0
    WARNING -- [2015-09-24 15:09:08 AST] -- AbstractCollector::pushNext(): Timeout while attempting to write data to Connector-Backend0
    WARNING -- [2015-09-24 15:10:13 AST] -- AbstractCollector::pushNext(): Timeout while attempting to write data to Connector-Backend0
    WARNING -- [2015-09-24 15:11:18 AST] -- AbstractCollector::pushNext(): Timeout while attempting to write data to Connector-Backend0

In other cases, write failure messages such as the following may also be observed:

    WARNING -- [2018-08-07 09:13:42 EDT] -- AbstractCollector::pushNext(): Error pushing (r)1533647612: group::VMAX-ARRAY-HEALTHSCORE...)=100.0 to Connector-Backend0
    com.watch4net.apg.v2.collector.PipeError: java.io.IOException: Cannot write to <FQDN/x.x.x.x>:2000
        at com.watch4net.apg.v2.collector.plugins.negotiation.NegotiatingConnector.pushData(NegotiatingConnector.java:161)
        at com.watch4net.apg.v2.collector.plugins.loadbalancer.decision.chain.AbstractEndPointElement.pushData(AbstractEndPointElement.java:51)
        at com.watch4net.apg.v2.collector.AbstractCollector.pushNext(AbstractCollector.java:56)
        at com.watch4net.apg.v2.collector.plugins.FailOverFilter.sendValue(FailOverFilter.java:327)
        at com.watch4net.apg.v2.collector.plugins.FailOverFilter.pushData(FailOverFilter.java:577)
        (...)
    Caused by: java.io.IOException: Cannot write to <FQDN/x.x.x.x>:2000
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.sendBuffer(AbstractNegotiatingClient.java:141)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.sendBuffer(AbstractNegotiatingClient.java:171)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.commit(AbstractNegotiatingClient.java:185)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.write(AbstractNegotiatingClient.java:121)
        at com.watch4net.apg.v2.collector.plugins.negotiation.NegotiatingConnector.pushData(NegotiatingConnector.java:156)
        ... 85 more
    INFO -- [2018-08-07 09:13:42 EDT] -- FailOverFilter::switchToFailOverMode(): The failover filter Filter_1-Backend0 is switching to FailOver mode...
    WARNING -- [2018-08-07 09:14:52 EDT] -- AbstractCollector::flushNext(): Error flushing Connector-Backend0
    com.watch4net.apg.v2.collector.PipeError: java.io.IOException: Cannot write to <FQDN/x.x.x.x>:2000
        at com.watch4net.apg.v2.collector.plugins.negotiation.NegotiatingConnector.flushData(NegotiatingConnector.java:176)
        at com.watch4net.apg.v2.collector.plugins.loadbalancer.decision.chain.AbstractEndPointElement.flushData(AbstractEndPointElement.java:63)
        at com.watch4net.apg.v2.collector.AbstractCollector.flushNext(AbstractCollector.java:85)
        at com.watch4net.apg.v2.collector.plugins.FailOverFilter.access$1400(FailOverFilter.java:53)
        at com.watch4net.apg.v2.collector.plugins.FailOverFilter$CheckNextComponentAvailabilityTimerTask.run(FailOverFilter.java:906)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
    Caused by: java.io.IOException: Cannot write to <FQDN/x.x.x.x>:2000
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.sendBuffer(AbstractNegotiatingClient.java:141)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.sendBuffer(AbstractNegotiatingClient.java:171)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.commit(AbstractNegotiatingClient.java:185)
        at com.emc.watch4net.socket.communicator.client.AbstractNegotiatingClient.flush(AbstractNegotiatingClient.java:226)
        at com.watch4net.apg.v2.collector.plugins.negotiation.NegotiatingConnector.flushData(NegotiatingConnector.java:170)
        ... 6 more
    WARNING -- [2018-08-07 09:14:52 EDT] -- FailOverFilter$CheckNextComponentAvailabilityTimerTask::run(): Error while pushing to next component. Next component is not available.
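To gauge how often these failures occur, the collector log can be summarized by message type. The following is a minimal sketch, not a product tool: the regular expressions are derived only from the log excerpts above, and the category labels and the function name summarize_log are hypothetical.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts above; the labels are illustrative only.
PATTERNS = {
    "timeout": re.compile(r"Timeout while attempting to write data to \S+"),
    "write_error": re.compile(r"Can(?:'t|not) write to"),
    "failover": re.compile(r"switching to FailOver mode"),
    "load_factor": re.compile(r"Couldn't synchronize the load factor"),
}


def summarize_log(lines):
    """Count how many lines match each known failure pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

An open file object can be passed directly, for example summarize_log(open(path, errors="replace")), where path points at collecting-1-0.log. A steadily growing "timeout" or "write_error" count would suggest the connector is still unable to reach the Primary Backend.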
In some instances, the netstat -an output on the collector host shows connections in FIN_WAIT state:

    tcp 0 62481 Collector IP:52140 PBE IP:2000 FIN_WAIT1

The local host firewall log or the host system event log shows errors similar to the following:

    terminating tcp proxy connection from servers <x.x.x.x>/2000 to <x.x.x.x>/45989 reassembly limit of 8192 bytes exceeded
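The netstat check can be automated by filtering the output for connections to the backend port that are stuck in a FIN_WAIT state. This is a rough sketch that assumes the Linux netstat -an column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); the function name find_stuck_connections is hypothetical.

```python
def find_stuck_connections(netstat_output, port=2000):
    """Return (local, remote, state, send_q) for FIN_WAIT connections to `port`.

    Assumes Linux-style `netstat -an` columns:
    proto, Recv-Q, Send-Q, local address, foreign address, state.
    """
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP/UNIX sockets, and lines without a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        _proto, _recv_q, send_q, local, remote, state = fields[:6]
        if not state.startswith("FIN_WAIT"):
            continue
        # The port is whatever follows the last ':' of the foreign address.
        if remote.rsplit(":", 1)[-1] != str(port):
            continue
        stuck.append((local, remote, state, int(send_q)))
    return stuck
```

A non-zero Send-Q on such a connection, as in the example line above, indicates data that the collector queued but the peer never acknowledged.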
Switches use port 2000 for internal communication. This causes heavy traffic on port 2000 and leads to data being dropped due to contention.
To work around this issue, use a different port for the Primary Backend, such as 2005 or 2006, and verify that it is open and listening.

For vApp environments, execute the following command on the Primary Backend server to open the port on the server, e.g.:

    /usr/sbin/enable_firewall_port.sh TCP 2005

Change the port from "2000" to a different port (e.g. 2005 or 2006) as follows:

1. On the Primary Backend host, edit the following file:

    /opt/APG/Backends/APG-Backend/Default/conf/socketinterface.xml

   If present, also edit the file below:

    /opt/APG/Backends/APG-Backend/Default/conf/negotiating-socket-interface.xml

   Locate the section below and change the port from 2000 to 2005 in the specified XML files:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE config SYSTEM "server.dtd">
    <config>
        <listen>2000</listen>
    </config>

2. Edit the following file:

    /opt/APG/Collecting/Load-Balancer/Load-Balancer/conf/socket-backend-0.xml

   Locate the section below and change the port from 2000 to 2005:

    <config>
        <host>x.x.x.x</host>
        <port>2000</port>

3. Restart the Backend and Load-Balancer services on the backend host.

4. On the collector hosts, find the following file and verify that the port number shows as 2005:

    /opt/APG/Collecting/Load-Balancer/Load-Balancer/tmp/Backend0/xx/Connector.xml

    <config>
        <host>x.x.x.x</host>
        <port>2005</port>
        <buffer-size>32768</buffer-size>
        <retry-count>0</retry-count>
        <retry-timeout>5000</retry-timeout>
        <connect-timeout>60000</connect-timeout>
        <write-timeout>5000</write-timeout>

5. Delete or move all temp files under /opt/APG/Collecting/Load-Balancer/Load-Balancer/tmp/ and restart the collector service so that it retrieves the updated files from the Load Balancer (LB) arbiter. The LB connector will then start pushing data to that Backend.
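After the change, the configured port and its reachability can be checked with a short script. The sketch below is illustrative only: it assumes the XML files are well formed and keep the port in a direct child of the root element (listen or port, as in the excerpts above), and the helper names read_port and port_is_open are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def read_port(xml_path, tag):
    """Return the integer port stored in the <tag> child of the root element.

    `tag` is "listen" for the backend socket interface files and "port" for
    the Load-Balancer connector files, based on the excerpts in this article.
    """
    root = ET.parse(xml_path).getroot()
    node = root.find(tag)
    if node is None or node.text is None:
        raise ValueError(f"<{tag}> not found in {xml_path}")
    return int(node.text.strip())


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, read_port("/opt/APG/Backends/APG-Backend/Default/conf/socketinterface.xml", "listen") on the Primary Backend should return the new port, and port_is_open(backend_host, 2005) run from a collector host should return True once the Backend has been restarted and the firewall port is open. A successful connection only shows that the port accepts connections; it does not prove that large payloads pass through an inspecting firewall.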
Cisco devices treat port 2000 as the SCCP (Skinny Client Control Protocol) port, which causes the communication issues between the Load-Balancer(s) and the Primary Backend.

It is also recommended to verify all other socket interface ports for the other databases (and the telnet interface ports), e.g. 2000/2001, 2100/2101, etc.

A Cisco ASA/PIX device denies any packet larger than 8192 bytes on TCP port 2000 specifically. The following Cisco Support Community document describes the issue:

    https://supportforums.cisco.com/document/97971/asa-terminating-tcp-proxy-connection-xxxx-reassembly-limit-8192-bytes-exceeded

You can use a "sniffer" type tool such as nmap/zenmap, Wireshark, etc. to confirm whether some type of packet inspection is taking place on port 2000.