 RabbitMQ service not starting and showing red in vIDM dashboard

https://knowledge.broadcom.com/external/article?articleNumber=367757

Products

VMware Aria Suite

Issue/Introduction

Symptoms:

  • RabbitMQ does not start during vIDM boot
  • The vIDM dashboard shows the error "There was a problem Messaging service Error retrieving RabbitMQ status"
  • Any interaction with RabbitMQ fails, for example:
    root@idm [ ~ ]# rabbitmqctl stop_app
    Stopping rabbit application on node rabbitmq@vm-idm ...
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
          
    root@idm [ ~ ]# rabbitmqctl force_reset
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
         
    root@idm [ ~ ]# rabbitmqctl start_app
    Starting node rabbitmq@vm-idm ...
    Error: unable to perform an operation on node 'rabbitmq@idm'. Please see diagnostics information and suggestions below.
  • Checking the RabbitMQ service status shows that a crash dump was written
  • The horizon log (/opt/vmware/horizon/workspace/logs/horizon.log) contains the message "Messaging Connection: Messaging connection test failed", along with entries such as:
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.channel.http.HttpChannel - Stop resending message to: http://127.0.0.1/AUDIT/API/1.0/REST/audit/consume. Status code: 500
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.uuid] message added back to queue because: Cannot send message to: AnalyticsHttpChannel[callbackUri=http://127.0.0.1/AUDIT/API/1.0/REST/audit/consume,serviceAuthTokenProvider=###.vmware.horizon.components.identity.accesscontrol.ServiceAuthTokenProvider@########,sslUtils=###.vmware.horizon.security.utils.SSLUtils@########,defaultHttpClient=org.apache.http.impl.client.InternalHttpClient@########,authMetadata=,httpPost=] (fail.send.callback.uri). [DeliveryTag:993]
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.uuid] is retrying current message for 3th time
    <TIMESTAMP> INFO  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.uuid] has one message requeued.
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.uuid] reached more than 10 errors in a row.

    As more and more messages pile up in RabbitMQ, they can consume all the disk space available to RabbitMQ. RabbitMQ then blocks the connection, which leaves the messaging service unhealthy. Check the log for the following symptoms:

    <TIMESTAMP> INFO  (AMQP Connection 127.0.0.1:5672) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessagingProvider - Connection to localhost unblocked by RabbitMQ
    <TIMESTAMP> WARN  (subscriber-thread-285) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessageSubscriber - Subscriber [id: -.analytics.uuid] reached more than 1000 errors in a row, disabling.
    <TIMESTAMP> WARN  (AMQP Connection 127.0.0.1:5672) [;;;] ###.vmware.horizon.messaging.provider.rabbitmq.RabbitMQMessagingProvider - Connection to localhost blocked by RabbitMQ: low on disk
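
One way to check for these symptoms from the appliance shell is sketched below. It counts the "low on disk" block messages in horizon.log (path as given above) and reports the free space on the filesystem that holds the RabbitMQ data. The function names are illustrative, and the /var/lib/rabbitmq data directory is an assumption (the usual default for Linux packages), not something stated in the original article; verify it on your appliance.

```shell
#!/bin/sh
# Sketch only: function names and RABBITMQ_DIR are illustrative assumptions.
HORIZON_LOG="${HORIZON_LOG:-/opt/vmware/horizon/workspace/logs/horizon.log}"
# Assumed data directory (usual Linux package default); verify on your appliance.
RABBITMQ_DIR="${RABBITMQ_DIR:-/var/lib/rabbitmq}"

# Print how many lines of log file $1 report a connection blocked for low disk.
count_disk_blocks() {
    grep -c 'blocked by RabbitMQ: low on disk' "$1"
}

# Print the available space, in KiB, on the filesystem that holds directory $1.
free_space_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Run the checks only when the paths exist, so the script is harmless elsewhere.
if [ -r "$HORIZON_LOG" ]; then
    echo "low-on-disk block messages: $(count_disk_blocks "$HORIZON_LOG")"
fi
if [ -d "$RABBITMQ_DIR" ]; then
    echo "free space under $RABBITMQ_DIR: $(free_space_kb "$RABBITMQ_DIR") KiB"
fi
```

A non-zero message count together with little free space is consistent with the blocked-connection symptom described above. The "unblocked" log lines are deliberately not counted, because the search pattern includes the "low on disk" suffix.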

Environment

VMware Identity Manager 3.3.x

Resolution

  1. Take a snapshot of the vIDM cluster
  2. Take the horizon-workspace service offline on each node:
    # service horizon-workspace stop
  3. Reset RabbitMQ on each node:
    # rabbitmqctl reset
  4. Restart RabbitMQ on each node:
    # systemctl restart rabbitmq-server.service
  5. Start the horizon-workspace service on each node:
    # service horizon-workspace start
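
The per-node steps 2 to 5 above could be wrapped in a small script such as the following sketch. It uses only the commands listed in the resolution; the run and reset_node helpers and the DRY_RUN variable are hypothetical names introduced here for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: run, reset_node and DRY_RUN are illustrative, not from the article.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 2 to 5 of the resolution, in order; stops at the first failing command.
reset_node() {
    run service horizon-workspace stop &&
    run rabbitmqctl reset &&
    run systemctl restart rabbitmq-server.service &&
    run service horizon-workspace start
}

reset_node
```

Run it once as-is to preview the commands, then as root with DRY_RUN=0 on each node after the snapshot from step 1 has been taken. Chaining with && means a failed reset does not go on to restart the services.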

