...
You are running VMware Integrated OpenStack 6.x.

The nova-consoleauth pod has restarted numerous times. From the management server, run:

osctl logs nova-consoleauth-<id of pod> | less

In the nova-consoleauth-<id> logs, you see entries similar to:

2019-08-22 02:24:31.286 1 ERROR oslo_service.service Traceback (most recent call last):
2019-08-22 02:24:31.286 1 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 796, in run_service
...
2019-08-22 02:24:31.286 1 ERROR oslo_service.service Traceback (most recent call last):
2019-08-22 02:24:31.286 1 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 796, in run_service

The nova-compute pod has restarted numerous times. From the management server, run:

osctl logs nova-compute-01-compute-<id of pod> | less

In the nova-compute logs, you see entries similar to:

2019-08-22 02:31:43.619 1 ERROR nova Traceback (most recent call last):
2019-08-22 02:31:43.619 1 ERROR nova File "/usr/bin/nova-compute", line 10, in <module>
2019-08-22 02:31:43.619 1 ERROR nova sys.exit(main())
...
2019-08-22 02:31:43.619 1 ERROR nova MessageDeliveryFailure: Unable to connect to AMQP server on rabbitmq.openstack.svc.cluster.local:5672 after inf tries: Queue.declare: (541) INTERNAL_ERROR - Cannot declare a queue 'queue 'reply_23425c42c8774a17a48f37c35442667b' in vhost 'nova'' on node 'rabbit@rabbitmq1-rabbitmq-1.rabbitmq1-dsv-59862a.openstack.svc.cluster.local': {vhost_supervisor_not_running,<<"nova">>}

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
From the management server, run:

osctl logs rabbitmq1-rabbitmq-<id of rabbitmq pod> | less

In the rabbitmq logs, you see entries similar to:

2019-08-29 06:33:47 =ERROR REPORT====
** Node 'rabbit@rabbitmq1-rabbitmq-2.rabbitmq1-dsv-59862a.openstack.svc.qtang.local' not responding **
** Removing (timedout) connection **
2019-08-29 06:33:47 =ERROR REPORT====
** Node 'rabbit@rabbitmq1-rabbitmq-0.rabbitmq1-dsv-59862a.openstack.svc.qtang.local' not responding **
** Removing (timedout) connection **
2019-08-29 06:34:38 =ERROR REPORT====
Mnesia('rabbit@rabbitmq1-rabbitmq-1.rabbitmq1-dsv-59862a.openstack.svc.qtang.local'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@rabbitmq1-rabbitmq-2.rabbitmq1-dsv-59862a.openstack.svc.qtang.local'}

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
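Paging through a long log with less can be slow, so a small filter for the partition signatures quoted above may help. The following is only a sketch: find_partition_markers is a hypothetical helper name, not part of osctl, and the three patterns are simply the strings that appear in the example excerpts, so your logs may need different ones.

```shell
#!/bin/sh
# Hypothetical helper (not an osctl feature): reads log text on stdin and
# prints only the lines that contain one of the partition signatures seen
# in the example excerpts.
find_partition_markers() {
  grep -E 'running_partitioned_network|inconsistent_database|not responding'
}

# Demonstration on a shortened sample (host names abbreviated for brevity).
printf '%s\n' \
  "2019-08-29 06:33:47 =ERROR REPORT====" \
  "** Node 'rabbit@rabbitmq1-rabbitmq-2' not responding **" \
  "** Removing (timedout) connection **" \
  | find_partition_markers
```

Against a live pod, the same function could be fed from the osctl logs command shown earlier by replacing "| less" with "| find_partition_markers".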
RabbitMQ encountered a network partition.

Check for and resolve:

Network errors
Network latency
Restart the crashed rabbitmq pod. If the underlying network problems are not resolved, the issue will recur.

List the rabbitmq pods:

osget pods | grep rabbitmq1-rabbitmq

Example output:

root@oms [ ~ ]# osget pods | grep rabbitmq1-rabbitmq
rabbitmq1-rabbitmq-0 1/1 Running 0 14d
rabbitmq1-rabbitmq-1 1/1 Running 0 14d
rabbitmq1-rabbitmq-2 1/1 Running 0 14d

Then delete the affected pod so that it restarts:

osdel pods rabbitmq1-rabbitmq-<id of rabbitmq pod>
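To pick out which rabbitmq pod to restart, the pod listing can be filtered rather than read by eye. This is a minimal sketch under stated assumptions: unhealthy_rabbitmq_pods is a hypothetical helper name, and it assumes the column order shown in the sample output above (name, ready count, status, restarts, age). It does not delete anything; it only prints candidate pod names.

```shell
#!/bin/sh
# Hypothetical helper: reads "osget pods" style rows on stdin and prints the
# names of rabbitmq pods whose status is not Running or whose restart count
# is greater than zero. Assumed columns: NAME READY STATUS RESTARTS AGE.
unhealthy_rabbitmq_pods() {
  awk '$1 ~ /^rabbitmq1-rabbitmq-/ && ($3 != "Running" || $4 + 0 > 0) { print $1 }'
}

# Demonstration on made-up sample rows (not real cluster output).
printf '%s\n' \
  "rabbitmq1-rabbitmq-0 1/1 Running 0 14d" \
  "rabbitmq1-rabbitmq-1 0/1 CrashLoopBackOff 12 14d" \
  "rabbitmq1-rabbitmq-2 1/1 Running 0 14d" \
  | unhealthy_rabbitmq_pods
```

On the management server, the function could be fed directly with "osget pods | unhealthy_rabbitmq_pods". Note that a pod can report Running with zero restarts and still be partitioned, so the log check remains the more reliable indicator.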