...
Some services are listed as "stopped" on the Centralized Management UI or when checking the service status on the Module Manager:

    srmbe01:/opt/APG/bin # /opt/APG/bin/manage-modules.sh service status all
     * Checking 'topology-mapping-service Default'... [ running ]
     * Checking 'topology-service Default'... [ running ]
     * Checking 'webservice-gateway Default'... [ stopped ]
     * Checking 'mysql Default'... [ running ]
     * Checking 'alerting-backend Default'... [ running ]
     * Checking 'backend Default'... [ running ]
     * Checking 'collector-manager Load-Balancer'... [ running ]
     * Checking 'collector-manager emc-watch4net-health'... [ running ]
     * Checking 'event-processing-manager Alert-Consolidation'... [ running ]
     * Checking 'event-processing-manager Maintenance-Manager'... [ running ]
     * Checking 'event-processing-manager cisco-ucs'... [ running ]
     * Checking 'event-processing-manager emc-vnx'... [ running ]
     * Checking 'event-processing-manager vmware-vcenter'... [ running ]
     * Checking 'script-engine Default'... [ running ]
     * Checking 'task-scheduler Default'... [ running ]
     * Checking 'compliance-backend generic-compliance'... [ running ]

Attempting to start the service succeeds, but the service stops immediately afterwards:

    srmbe01:/opt/APG/bin # /opt/APG/bin/manage-modules.sh service start webservice-gateway Default
     * Starting 'webservice-gateway Default'... [ OK ]
    srmbe01:/opt/APG/bin # /opt/APG/bin/manage-modules.sh service status webservice-gateway Default
     * Checking 'webservice-gateway Default'... [ stopped ]

Under the log directory for the service, there are numerous concurrent instances for the log files (<LOG_FILE>-0-#.log), for example:

    srmbe01:/opt/APG/Tools/Webservice-Gateway/Default/logs # ls -l *-0-*.log
    -rw-r--r-- 1 apg apg 615133 Dec 21 14:57 gateway-0-0.log
    -rw-r--r-- 1 apg apg 3258 Dec 21 14:50 gateway-0-1.log

The latest instance of the log file (e.g. gateway-0-1.log from above) has the following error or similar:

    SEVERE -- [2016-12-12 16:18:11 NZDT] -- HttpServer::start(): an error occured starting the serverjava.net.BindException: Address already in use

There is no PID file located in the log directory (e.g. apg-webservice-gateway-default.pid).
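The log check described above can be sketched as a small shell helper. This is only an illustration: the function name logs_with_bind_error is hypothetical and not part of APG, and the search string assumes the message appears exactly as in the example log line.

```shell
#!/bin/sh
# Sketch only: logs_with_bind_error is a hypothetical helper, not an APG tool.

# Print the name of every instance log (<LOG_FILE>-0-#.log) in directory $1
# that contains the "Address already in use" bind error.
logs_with_bind_error() {
    grep -l 'java.net.BindException: Address already in use' "$1"/*-0-*.log 2>/dev/null
}

# Example call (path taken from the article):
#   logs_with_bind_error /opt/APG/Tools/Webservice-Gateway/Default/logs
```

If the helper prints one or more file names, the symptom described here is likely present. If it prints nothing, the service is probably stopping for a different reason.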
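The PID file in the log directory (e.g. apg-webservice-gateway-default.pid) is used by the Module Manager to monitor the process. If the file does not exist, the Module Manager reports the service as "stopped", even if the process itself is still running. This can occur if the service did not stop properly, or was hung and hence did not stop when requested. In that situation the original process most likely still holds the service's listening port, which is why each newly started instance logs "Address already in use" and stops immediately.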
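The relationship between a PID file and its process can be illustrated with a short shell sketch. It is not the Module Manager's actual implementation, which the article does not describe. The function name pid_file_state and its three result words are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: pid_file_state is a hypothetical helper, not an APG tool.

# Print "missing", "stale", or "running" for the PID file given as $1.
pid_file_state() {
    if [ ! -f "$1" ]; then
        echo "missing"
        return
    fi
    pid=$(cat "$1")
    # kill -0 sends no signal. It only tests whether the process exists and
    # whether the caller may signal it, so run this as root or as the owner
    # of the process to avoid a false "stale" result.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

# Example call (file name taken from the article):
#   pid_file_state /opt/APG/Tools/Webservice-Gateway/Default/logs/apg-webservice-gateway-default.pid
```

A result of "missing" while a matching Java process is still visible in the process list corresponds to the condition described above.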
To work around this issue, terminate the leftover process and restart the services via the Module Manager:

1. Stop all services on the host using the Module Manager:

    srmbe01:/opt/APG/bin # manage-modules.sh service stop all
     * Stopping 'topology-service Default'... [ OK ]
     * Stopping 'topology-mapping-service Default'... [ OK ]
     * Stopping 'task-scheduler Default'... [ OK ]
     * Stopping 'script-engine Default'... [ OK ]
     * Stopping 'event-processing-manager vmware-vcenter'... [ OK ]
     * Stopping 'event-processing-manager emc-vnx'... [ OK ]
     * Stopping 'event-processing-manager cisco-ucs'... [ OK ]
     * Stopping 'event-processing-manager Maintenance-Manager'... [ OK ]
     * Stopping 'event-processing-manager Alert-Consolidation'... [ OK ]
     * Stopping 'collector-manager emc-watch4net-health'... [ OK ]
     * Stopping 'collector-manager Load-Balancer'... [ OK ]
     * Stopping 'compliance-backend generic-compliance'... [ OK ]
     * Stopping 'backend Default'... [ OK ]
     * Stopping 'alerting-backend Default'... [ OK ]
     * Stopping 'mysql Default'... [ OK ]
     * Stopping 'webservice-gateway Default'... [ not-running ]

2. Run the "ps -ef | grep -i apg" command to search for any processes that have not stopped, and record the PID (28997 in the example below):

    srmbe01:/opt/APG/bin # ps -ef | grep -i apg
    root 25486 21564 0 16:15 pts/0 00:00:00 grep -i apg
    apg 28997 1 2 11:35 ? 00:08:02 /opt/APG/Java/Sun-JRE/8.0.102/bin/java ...

3. Kill the leftover process and attempt to restart all the services:

    srmbe01:/opt/APG/bin # kill 28997
    srmbe01:/opt/APG/bin # ./manage-modules.sh service start all
     * Starting 'topology-mapping-service Default'... [ OK ]
     * Starting 'topology-service Default'... [ OK ]
     * Starting 'webservice-gateway Default'... [ OK ]
     * Starting 'mysql Default'... [ OK ]
     * Starting 'alerting-backend Default'... [ OK ]
     * Starting 'backend Default'... [ OK ]
     * Starting 'collector-manager Load-Balancer'... [ OK ]
     * Starting 'collector-manager emc-watch4net-health'... [ OK ]
     * Starting 'event-processing-manager Alert-Consolidation'... [ OK ]
     * Starting 'event-processing-manager Maintenance-Manager'... [ OK ]
     * Starting 'event-processing-manager cisco-ucs'... [ OK ]
     * Starting 'event-processing-manager emc-vnx'... [ OK ]
     * Starting 'event-processing-manager vmware-vcenter'... [ OK ]
     * Starting 'script-engine Default'... [ OK ]
     * Starting 'task-scheduler Default'... [ OK ]
     * Starting 'compliance-backend generic-compliance'... [ OK ]

4. Verify that the services stay running, and confirm in the most recent log file that the service has started.
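The process search in the procedure above can be sketched as a filter over "ps -ef" output that prints only the PIDs of leftover APG processes. The function name leftover_apg_pids is hypothetical. The sketch assumes the usual Linux "ps -ef" column layout, with the PID in column 2 and the command starting in column 8.

```shell
#!/bin/sh
# Sketch only: leftover_apg_pids is a hypothetical helper, not an APG tool.

# Read `ps -ef` output on stdin and print the PID (column 2) of every line
# that mentions "apg" (case-insensitive), skipping the grep command itself.
leftover_apg_pids() {
    awk 'tolower($0) ~ /apg/ && $8 != "grep" { print $2 }'
}

# Example call, to be run after "manage-modules.sh service stop all":
#   ps -ef | leftover_apg_pids
```

The sketch deliberately only lists PIDs and does not kill anything, so each process can be reviewed before running kill on it, as in the procedure above.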