...
Deployment requests remain stuck in various life-cycle states for a long time, until the time-out is reached.
Suspending a vRA node, or a network partition between the vRA nodes in a clustered deployment, results in lost connectivity between the RabbitMQ cluster members, which can leave RabbitMQ in a de-clustered state.
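To confirm that an environment is in this state, the cluster membership that each broker reports can be inspected directly. The quick check below is only a sketch: it assumes kubectl access from a vRA node, the "prelude" namespace and the rabbitmq-ha-0 through rabbitmq-ha-2 pod names used in the workaround below, and the Erlang-term "rabbitmqctl cluster_status" output shown there (which also reports a "partitions" entry; newer RabbitMQ releases print a different, human-readable format).

    # Quick filter: show the running-node and partition information reported by each RabbitMQ pod.
    # Long lists may wrap onto following lines, so run the full "rabbitmqctl cluster_status"
    # (as in the workaround below) if the filtered output is inconclusive.
    for i in 0 1 2; do
      echo "--- rabbitmq-ha-$i ---"
      kubectl -n prelude exec rabbitmq-ha-$i -- rabbitmqctl cluster_status | grep -E 'running_nodes|partitions'
    done

On a healthy cluster, every node lists all three members under "running_nodes" and an empty "partitions" list.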
There is currently no resolution for this issue, as it depends on the resilience of the RabbitMQ cluster.
To work around the issue, reset the RabbitMQ cluster:

1. SSH to one of the nodes in the vRA cluster.
2. Check the status of the rabbitmq-ha pods:

       root@vra-node [ ~ ]# kubectl -n prelude get pods --selector=app=rabbitmq-ha
       NAME            READY   STATUS    RESTARTS   AGE
       rabbitmq-ha-0   1/1     Running   0          3d16h
       rabbitmq-ha-1   1/1     Running   0          3d16h
       rabbitmq-ha-2   1/1     Running   0          3d16h

3. If all rabbitmq-ha pods are healthy, check the RabbitMQ cluster status on each of them:

       seq 0 2 | xargs -n 1 -I {} kubectl exec -n prelude rabbitmq-ha-{} -- bash -c "rabbitmqctl cluster_status"

   NOTE: Analyze the command output for each RabbitMQ node and verify that the "running_nodes" list contains all cluster members from the "nodes > disc" list. (A scripted version of this check is sketched after this procedure.)

       [{nodes,[{disc,['rabbit@rabbitmq-ha-0.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                       'rabbit@rabbitmq-ha-1.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                       'rabbit@rabbitmq-ha-2.rabbitmq-ha-discovery.prelude.svc.cluster.local']}]},
        {running_nodes,['rabbit@rabbitmq-ha-2.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                        'rabbit@rabbitmq-ha-1.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                        'rabbit@rabbitmq-ha-0.rabbitmq-ha-discovery.prelude.svc.cluster.local']}
        ...]

4. If the "running_nodes" list does not contain all RabbitMQ cluster members, RabbitMQ is in a de-clustered state and needs to be manually reconfigured. For example:

       [{nodes,[{disc,['rabbit@rabbitmq-ha-0.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                       'rabbit@rabbitmq-ha-1.rabbitmq-ha-discovery.prelude.svc.cluster.local',
                       'rabbit@rabbitmq-ha-2.rabbitmq-ha-discovery.prelude.svc.cluster.local']}]},
        {running_nodes,['rabbit@rabbitmq-ha-0.rabbitmq-ha-discovery.prelude.svc.cluster.local']}
        ...]

To reconfigure the RabbitMQ cluster, complete the steps below:

1. SSH to one of the vRA nodes.
2. Reconfigure the RabbitMQ cluster with "vracli reset rabbitmq":

       root@vra_node [ ~ ]# vracli reset rabbitmq
       'reset rabbitmq' is a destructive command. Type 'yes' if you want to continue, or 'no' to stop: yes

3. Wait until all rabbitmq-ha pods are re-created and healthy: "kubectl -n prelude get pods --selector=app=rabbitmq-ha"

       NAME            READY   STATUS    RESTARTS   AGE
       rabbitmq-ha-0   1/1     Running   0          9m53s
       rabbitmq-ha-1   1/1     Running   0          9m35s
       rabbitmq-ha-2   1/1     Running   0          9m14s

4. Delete the ebs pods: "kubectl -n prelude delete pods --selector=app=ebs-app".
5. Wait until all ebs pods are re-created and ready: "kubectl -n prelude get pods --selector=app=ebs-app"

       NAME                       READY   STATUS    RESTARTS   AGE
       ebs-app-84dd59f4f4-jvbsf   1/1     Running   0          2m55s
       ebs-app-84dd59f4f4-khv75   1/1     Running   0          2m55s
       ebs-app-84dd59f4f4-xthfs   1/1     Running   0          2m55s

The RabbitMQ cluster is now reconfigured. Request a new deployment to verify that it completes successfully.
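The "running_nodes" versus "nodes > disc" comparison from the first procedure can also be scripted when it needs to be repeated. The sketch below is only an illustration built on the commands above; it assumes the "prelude" namespace, the rabbitmq-ha-0 through rabbitmq-ha-2 pod names, a three-node cluster, and the Erlang-term "rabbitmqctl cluster_status" output format shown in the examples (newer RabbitMQ releases print a different, human-readable format that this parsing does not handle).

    #!/bin/bash
    # Best-effort check: does every rabbitmq-ha pod report all expected cluster members as running?
    EXPECTED=3   # assumption: three-node RabbitMQ cluster, as in the examples above

    for i in 0 1 2; do
      status="$(kubectl -n prelude exec rabbitmq-ha-$i -- rabbitmqctl cluster_status 2>/dev/null)"
      # Flatten the Erlang-term output, extract the running_nodes list, and count its rabbit@ entries.
      running="$(printf '%s' "$status" | tr -d ' \n' \
          | sed -n 's/.*{running_nodes,\[\([^]]*\)\].*/\1/p' \
          | grep -o 'rabbit@' | wc -l)"
      if [ "$running" -eq "$EXPECTED" ]; then
        echo "rabbitmq-ha-$i: OK ($running/$EXPECTED members running)"
      else
        echo "rabbitmq-ha-$i: DE-CLUSTERED ($running/$EXPECTED members running)"
      fi
    done

If any pod reports fewer members than expected, proceed with "vracli reset rabbitmq" as described in the second procedure.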