Cluster Formation and Peer Discovery

Introduction

This guide covers various automation-oriented cluster formation and peer discovery features. For a general overview of RabbitMQ clustering, please refer to the Clustering Guide.

This guide assumes general familiarity with RabbitMQ clustering and focuses on the peer discovery subsystem. For example, it will not cover what ports must be open for inter-node communication, how nodes authenticate to each other, and so on. Besides discovery mechanisms and their configuration, this guide also covers a closely related topic of rejoining nodes, the problem of initial cluster formation with nodes booting in parallel as well as additional health checks offered by some discovery implementations.

The guide also covers the basics of peer discovery troubleshooting.

What is Peer Discovery?

To form a cluster, new ("blank") nodes need to be able to discover their peers. This can be done using a variety of mechanisms (backends). Some mechanisms assume all cluster members are known ahead of time (for example, listed in the config file), others are dynamic (nodes can come and go).

All peer discovery mechanisms assume that newly joining nodes will be able to contact their peers in the cluster and authenticate with them successfully. The mechanisms that rely on an external service (e.g. DNS or Consul) or API (e.g. AWS or Kubernetes) require the service(s) or API(s) to be available and reachable on their standard ports. If the service cannot be reached, the node will not be able to join the cluster.

Available Discovery Mechanisms

The following mechanisms are built into the core and always available:

  • Config file
  • DNS

Additional peer discovery mechanisms are available via plugins. The following peer discovery plugins ship with RabbitMQ as of 3.7.0:

  • AWS (EC2)
  • Kubernetes
  • Consul
  • etcd

The above plugins do not need to be installed but, like all plugins, they do need to be enabled before the node starts, using the --offline mode of rabbitmq-plugins:

rabbitmq-plugins --offline enable [plugin name]

A node with configuration settings that belong to a non-enabled peer discovery plugin will fail to start and report those settings as unknown.

Specifying the Peer Discovery Mechanism

The discovery mechanism to use is specified in the config file, as are various mechanism-specific settings, for example, discovery service hostnames, credentials, and so on. cluster_formation.peer_discovery_backend is the key that controls what discovery module (implementation) is used:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config

The module has to implement the rabbit_peer_discovery_backend behaviour. Plugins therefore can introduce their own discovery mechanisms.

How Peer Discovery Works

When a node starts and detects it doesn't have a previously initialised database, it will check if there's a peer discovery mechanism configured. If that's the case, it will then perform the discovery and attempt to contact each discovered peer in order. Finally, it will attempt to join the cluster of the first reachable peer.

Depending on the backend (mechanism) used, the process of peer discovery may involve contacting external services, for example, an AWS API endpoint, a Consul node or performing a DNS query. Some backends require nodes to register (tell the backend that the node is up and should be counted as a cluster member): for example, Consul and etcd both support registration. With other backends the list of nodes is configured ahead of time (e.g. config file). Those backends are said to not support node registration. In some cases node registration is implicit or managed by an external service. AWS autoscaling groups is a good example: AWS keeps track of group membership, so nodes don't have to (or cannot) explicitly register. However, the list of cluster members is not predefined. Such backends usually include a no-op registration step and apply one of the race condition mitigation mechanisms described below.

When a cluster is first formed and there are no registered nodes yet, a natural race condition between booting nodes occurs. Different backends address this problem differently: some try to acquire a lock with an external service, others rely on randomized delays. This problem does not apply to the backends that require listing all nodes ahead of time.

When the configured backend supports registration, nodes unregister when they stop.

If peer discovery isn't configured, or it fails, or no peers are reachable, a node that wasn't a cluster member in the past will initialise from scratch and proceed as a standalone node.

If a node previously was a cluster member, it will try to contact its "last seen" peer for a period of time. In this case, no peer discovery will be performed. This is true for all backends.
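The boot-time flow described in this section can be summarised in a short sketch. This is an illustration only, not the server's implementation: the function name, the action labels and the `discover` / `is_reachable` callbacks are hypothetical stand-ins for the configured backend and for inter-node connectivity checks.

```python
from typing import Callable, Optional, Sequence, Tuple


def choose_boot_action(
    has_initialised_database: bool,
    backend_configured: bool,
    discover: Callable[[], Sequence[str]],
    is_reachable: Callable[[str], bool],
) -> Tuple[str, Optional[str]]:
    """Return (action, peer) following the flow described above.

    All names here are hypothetical; only the ordering of the checks
    mirrors the text of this guide.
    """
    if has_initialised_database:
        # Former cluster members skip discovery and contact their last seen peer.
        return ("contact_last_seen_peer", None)
    if not backend_configured:
        return ("start_standalone", None)
    try:
        peers = list(discover())
    except Exception:
        # A failed discovery attempt is treated like "no peers found".
        return ("start_standalone", None)
    for peer in peers:  # peers are tried in the order they were discovered
        if is_reachable(peer):
            return ("join", peer)
    return ("start_standalone", None)


if __name__ == "__main__":
    action = choose_boot_action(
        has_initialised_database=False,
        backend_configured=True,
        discover=lambda: ["rabbit@node1", "rabbit@node2"],
        is_reachable=lambda peer: peer == "rabbit@node2",
    )
    print(action)
```

The key property is that discovery only runs for blank nodes; a node with an existing database never reaches the discovery step.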

Rejoining Nodes

A new node joining a cluster is just one possible case. Another common scenario is when an existing cluster member temporarily leaves and then rejoins the cluster.

Existing cluster members will not perform peer discovery. Instead they will try to contact their previously known peers.

If a node previously was a cluster member, when it boots it will try to contact its "last seen" peer for a period of time. If the peer is not booted (e.g. when a full cluster restart or upgrade is performed) or cannot be reached, the node will retry the operation a number of times. Default values are 10 retries and 30 seconds per attempt, respectively, or 5 minutes total. In environments where nodes can take a long and/or uneven time to start it is recommended that the number of retries is increased.
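As a rough aid for choosing a retry count, the arithmetic above can be written down as a small sketch. The helper names are hypothetical, and the only inputs taken from this guide are the defaults of 10 retries and 30 seconds per attempt.

```python
import math


def total_rejoin_wait_seconds(retries: int = 10, seconds_per_attempt: int = 30) -> int:
    """Upper bound on how long a restarting node keeps trying its last seen peer."""
    return retries * seconds_per_attempt


def retries_needed(slowest_peer_boot_seconds: float, seconds_per_attempt: int = 30) -> int:
    """Smallest retry count whose total wait covers the slowest expected peer boot."""
    return max(1, math.ceil(slowest_peer_boot_seconds / seconds_per_attempt))


if __name__ == "__main__":
    print(total_rejoin_wait_seconds())  # 300 seconds, i.e. 5 minutes
    print(retries_needed(600))          # 20 retries cover a 10 minute boot
```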

If a node is reset after losing contact with the cluster, it will behave like a blank node. Note that other cluster members might still consider it to be a cluster member, in which case the two sides will disagree and the node will fail to join. Such reset nodes must also be removed from the cluster using rabbitmqctl forget_cluster_node executed against an existing cluster member.

If a node was explicitly removed from the cluster by the operator and then reset, it can rejoin the cluster. In this case it will behave exactly like a blank node would.

How to Configure Peer Discovery

Peer discovery plugins are configured just like the core server and other plugins: using a config file.

cluster_formation.peer_discovery_backend is the key that controls what peer discovery backend will be used. Each backend will also have a number of configuration settings specific to it. The rest of the guide will cover configurable settings specific to a particular mechanism as well as provide examples for each one.

Environment variables can also be used to configure several mechanisms for easier migration from rabbitmq-autocluster. This method is highly discouraged, however: using environment variables is more error prone compared to the config file, and it is harder to verify effective configuration. Only those migrating clusters that use rabbitmq-autocluster should use environment variables for peer discovery configuration, so this guide leaves them out. Variable names are the same as used by rabbitmq-autocluster.

Config File Peer Discovery Backend

The most basic way for a node to discover its cluster peers is to read a list of nodes from the config file.

This is done using the cluster_formation.classic_config.nodes config setting.

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config

cluster_formation.classic_config.nodes.1 = [email protected]
cluster_formation.classic_config.nodes.2 = [email protected]

The following example demonstrates the same configuration in the classic config format. The 2nd member of the rabbit.cluster_nodes tuple is the node type to use for the current node. In the vast majority of cases all nodes should be disc nodes.

[
 {rabbit, [
           {cluster_nodes, {['[email protected]',
                             '[email protected]'], disc}}
          ]}
].

DNS Peer Discovery Backend

Another built-in peer discovery mechanism as of RabbitMQ 3.7.0 is DNS-based. It relies on a pre-configured hostname ("seed hostname") with DNS A (or AAAA) records and reverse DNS lookups to perform peer discovery. More specifically, this mechanism will perform the following steps:

  • Query DNS A records of the seed hostname.
  • For each returned DNS record's IP address, perform a reverse DNS lookup.
  • Prepend the current node's prefix (e.g. rabbit in rabbit@node1.eng.example.local) to each hostname and return the result.

For example, let's consider a seed hostname of discovery.eng.example.local. It has 2 DNS A records that return two IP addresses: 192.168.100.1 and 192.168.100.2. Reverse DNS lookups for those IP addresses return node1.eng.example.local and node2.eng.example.local, respectively. Current node's name is not set and defaults to rabbit@$(hostname). The final list of nodes discovered will contain two nodes: rabbit@node1.eng.example.local and rabbit@node2.eng.example.local.

The seed hostname is set using the cluster_formation.dns.hostname config setting:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_dns

cluster_formation.dns.hostname = discovery.eng.example.local
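The three discovery steps listed above (A record query, reverse lookup, node name assembly) can be sketched as follows. This is an illustration under stated assumptions, not the backend's code: the function names are hypothetical, the resolver functions are injectable so the logic can be exercised without a real DNS zone, and `socket.gethostbyname_ex` only covers A (IPv4) records.

```python
import socket
from typing import Callable, List


def resolve_a_records(seed_hostname: str) -> List[str]:
    """Step 1: query the A records of the seed hostname (IPv4 only)."""
    _name, _aliases, addresses = socket.gethostbyname_ex(seed_hostname)
    return addresses


def reverse_lookup(ip_address: str) -> str:
    """Step 2: reverse DNS lookup for a single IP address."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return hostname


def discover_nodes(
    seed_hostname: str,
    node_prefix: str = "rabbit",
    resolve: Callable[[str], List[str]] = resolve_a_records,
    reverse: Callable[[str], str] = reverse_lookup,
) -> List[str]:
    """Step 3: combine the node prefix with every reverse-resolved hostname."""
    return [f"{node_prefix}@{reverse(ip)}" for ip in resolve(seed_hostname)]


if __name__ == "__main__":
    # Fake resolvers populated with the example records from this section.
    a_records = {"discovery.eng.example.local": ["192.168.100.1", "192.168.100.2"]}
    ptr_records = {
        "192.168.100.1": "node1.eng.example.local",
        "192.168.100.2": "node2.eng.example.local",
    }
    for node in discover_nodes(
        "discovery.eng.example.local",
        resolve=a_records.__getitem__,
        reverse=ptr_records.__getitem__,
    ):
        print(node)
```

Because both lookups are plain DNS queries, every address behind the seed hostname needs a working reverse record, otherwise the second step fails for that peer.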

Peer Discovery on AWS (EC2)

An AWS (EC2)-specific discovery mechanism is available via a plugin. It provides two ways for a node to discover its peers:

  • Using EC2 instance tags
  • Using AWS autoscaling group membership

Both methods rely on AWS-specific APIs (endpoints) and features and thus cannot work in other IaaS environments. Once a list of cluster member instances is retrieved, final node names are computed using instance hostnames or IP addresses.

When the AWS peer discovery mechanism is used, nodes will delay their startup for a randomly picked value to reduce the probability of a race condition during initial cluster formation.

Configuration and Credentials

Before a node can perform any operations on AWS, it needs to have a set of AWS account credentials configured. This can be done in a couple of ways:

  1. Via config file
  2. Using environment variables

EC2 Instance Metadata service for the region will also be consulted.

The following example snippet configures RabbitMQ to use the AWS peer discovery backend and provides information about AWS region as well as a set of credentials:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_aws

cluster_formation.aws.region = us-east-1
cluster_formation.aws.access_key_id = ANIDEXAMPLE
cluster_formation.aws.secret_key = WjalrxuTnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY

If region is left unconfigured, us-east-1 will be used by default. Sensitive values in the configuration file can optionally be encrypted.

If an IAM role is assigned to EC2 instances running RabbitMQ nodes, a policy has to be used to allow said instances to use the EC2 Instance Metadata Service. When the plugin is configured to use autoscaling group members, a policy has to grant access to describe autoscaling group members (instances). Below is an example of a policy that covers both use cases:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingInstances",
        "ec2:DescribeInstances"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

Using Autoscaling Group Membership

When autoscaling-based peer discovery is used, current node's EC2 instance autoscaling group members will be listed and used to produce the list of discovered peers.

To use autoscaling group membership, set the cluster_formation.aws.use_autoscaling_group key to true:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_aws

cluster_formation.aws.region = us-east-1
cluster_formation.aws.access_key_id = ANIDEXAMPLE
cluster_formation.aws.secret_key = WjalrxuTnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY

cluster_formation.aws.use_autoscaling_group = true

Using EC2 Instance Tags

When tags-based peer discovery is used, the plugin will list EC2 instances using EC2 API and filter them by configured instance tags. Resulting instance set will be used to produce the list of discovered peers.

Tags are configured using the cluster_formation.aws.instance_tags key. The example below uses three tags: region, service, and environment.

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_aws

cluster_formation.aws.region = us-east-1
cluster_formation.aws.access_key_id = ANIDEXAMPLE
cluster_formation.aws.secret_key = WjalrxuTnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY

cluster_formation.aws.instance_tags.region = us-east-1
cluster_formation.aws.instance_tags.service = rabbitmq
cluster_formation.aws.instance_tags.environment = staging
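The tag filtering step can be pictured with the sketch below. It is not the plugin's code and does not call the EC2 API: the instance record shape (`private_dns_name`, `private_ip`, `tags`), the helper names and the sample hostnames are all hypothetical, and it assumes an instance must carry every configured tag to be included.

```python
from typing import Dict, Iterable, List


def filter_by_tags(instances: Iterable[Dict], required_tags: Dict[str, str]) -> List[Dict]:
    """Keep instances whose tags contain every configured key/value pair.

    Assumption: all configured tags must match (logical AND).
    """
    return [
        inst
        for inst in instances
        if all(inst.get("tags", {}).get(key) == value for key, value in required_tags.items())
    ]


def node_names(instances: Iterable[Dict], use_private_ip: bool = False, prefix: str = "rabbit") -> List[str]:
    """Compute node names from private DNS hostnames or, optionally, private IPs."""
    field = "private_ip" if use_private_ip else "private_dns_name"
    return [f"{prefix}@{inst[field]}" for inst in instances]


if __name__ == "__main__":
    # Hypothetical instance records standing in for an EC2 API response.
    instances = [
        {
            "private_dns_name": "ip-10-0-0-1.ec2.internal",
            "private_ip": "10.0.0.1",
            "tags": {"region": "us-east-1", "service": "rabbitmq", "environment": "staging"},
        },
        {
            "private_dns_name": "ip-10-0-0-2.ec2.internal",
            "private_ip": "10.0.0.2",
            "tags": {"region": "us-east-1", "service": "rabbitmq", "environment": "production"},
        },
    ]
    wanted = {"region": "us-east-1", "service": "rabbitmq", "environment": "staging"}
    print(node_names(filter_by_tags(instances, wanted)))
```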

Using Private EC2 Instance IPs

By default peer discovery will use private DNS hostnames to compute node names. It is possible to opt into using private IPs instead by setting the cluster_formation.aws.use_private_ip key to true:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_aws

cluster_formation.aws.region = us-east-1
cluster_formation.aws.access_key_id = ANIDEXAMPLE
cluster_formation.aws.secret_key = WjalrxuTnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY

cluster_formation.aws.use_autoscaling_group = true
cluster_formation.aws.use_private_ip = true

Peer Discovery on Kubernetes

A Kubernetes-based discovery mechanism is available via a plugin.

With this mechanism, nodes fetch a list of their peers from the Kubernetes API endpoint using a set of configured values: a URI scheme, host, port, as well as the token and certificate paths.

It is highly recommended that RabbitMQ clusters are deployed using a stateful set. If a stateless set is used, recreated nodes will not have their persisted data and will start as blank nodes. This can lead to data loss and higher network traffic volume due to more frequent eager synchronisation of newly joining nodes. Stateless sets are also prone to the natural race condition during initial cluster formation, unlike stateful sets that initialise pods one by one.

To use Kubernetes for peer discovery, set the cluster_formation.peer_discovery_backend to rabbit_peer_discovery_k8s:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

# Kubernetes API hostname (or IP address). Default value is kubernetes.default.svc.cluster.local
cluster_formation.k8s.host = kubernetes.default.example.local

It is possible to configure Kubernetes API port and URI scheme:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local
# 443 is used by default
cluster_formation.k8s.port = 443
# https is used by default
cluster_formation.k8s.scheme = https

Kubernetes token file path is configurable via cluster_formation.k8s.token_path:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local
# default value is /var/run/secrets/kubernetes.io/serviceaccount/token
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token

It must point to a local file that exists and is readable by RabbitMQ.

Certificate and namespace paths use cluster_formation.k8s.cert_path and cluster_formation.k8s.namespace_path, respectively:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local
# default value is /var/run/secrets/kubernetes.io/serviceaccount/token
cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token

# default value is /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# default value is /var/run/secrets/kubernetes.io/serviceaccount/namespace
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace

Just like with the token path key, both must point to a local file that exists and is readable by RabbitMQ.

When a list of peer nodes is computed from a list of pod containers returned by Kubernetes, either hostnames or IP addresses can be used. This is configurable using the cluster_formation.k8s.address_type key:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local

cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace

# should result set use hostnames or IP addresses
# of Kubernetes API-reported containers?
# supported values are "hostname" and "ip"
cluster_formation.k8s.address_type = hostname

Supported values are ip or hostname. hostname is the recommended option but has limitations: it can only be used with stateful sets (also highly recommended) and headless services. ip is used by default for better compatibility.

It is possible to append a suffix to peer hostnames returned by Kubernetes using cluster_formation.k8s.hostname_suffix:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local

cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace

# no suffix is appended by default
cluster_formation.k8s.hostname_suffix = rmq.eng.example.local

Service name is rabbitmq by default but can be overridden using the cluster_formation.k8s.service_name key if needed:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local

cluster_formation.k8s.token_path = /var/run/secrets/kubernetes.io/serviceaccount/token
cluster_formation.k8s.cert_path = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
cluster_formation.k8s.namespace_path = /var/run/secrets/kubernetes.io/serviceaccount/namespace

# overrides Kubernetes service name. Default value is "rabbitmq".
cluster_formation.k8s.service_name = rmq-qa

As mentioned above, stateful sets are the recommended way of running RabbitMQ on Kubernetes. Stateful set pods are initialised one at a time, which effectively addresses the natural race condition during initial cluster formation. Randomized startup delay in such scenarios can use a significantly lower delay value range (e.g. 0 to 2 seconds):

cluster_formation.randomized_startup_delay_range.min = 0
cluster_formation.randomized_startup_delay_range.max = 2

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s

cluster_formation.k8s.host = kubernetes.default.example.local

# ...

Peer Discovery Using Consul

A Consul-based discovery mechanism is available via a plugin. Consul 0.8.0 and later versions are supported.

Nodes register with Consul on boot and unregister when they leave. Prior to registration, nodes will attempt to acquire a lock in Consul to reduce the probability of a race condition during initial cluster formation. When a node registers with Consul, it will set up a periodic health check for itself (more on this below).

To use Consul for peer discovery, set the cluster_formation.peer_discovery_backend to rabbit_peer_discovery_consul:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

# Consul host (hostname or IP address). Default value is localhost
cluster_formation.consul.host = consul.eng.example.local

It is possible to configure Consul port and URI scheme:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# 8500 is used by default
cluster_formation.consul.port = 8500
# http is used by default
cluster_formation.consul.scheme = http

To configure Consul ACL token, use cluster_formation.consul.acl_token:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
cluster_formation.consul.acl_token = acl-token-value

Service name (as registered in Consul) defaults to "rabbitmq" but can be overridden:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# rabbitmq is used by default
cluster_formation.consul.svc = rabbitmq

Service hostname (address) as registered in Consul will be fetched by peers and therefore must resolve on all nodes. The hostname can be computed by the plugin or specified by the user. When computed automatically, a number of node and OS properties can be used:

  • Hostname (as returned by gethostname(2))
  • Node name (without the rabbit@ prefix)
  • IP address of a NIC (network interface controller)

When cluster_formation.consul.svc_addr_auto is set to false, the service address will be taken as is from cluster_formation.consul.svc_addr. When it is set to true, other options explained below come into play.

In the following example, the service address reported to Consul is hardcoded to hostname1.rmq.eng.example.local instead of being computed automatically from the environment:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local

cluster_formation.consul.svc = rabbitmq
# do not compute service address, it will be specified below
cluster_formation.consul.svc_addr_auto = false
# service address, will be communicated to other nodes
cluster_formation.consul.svc_addr = hostname1.rmq.eng.example.local
# use long RabbitMQ node names?
cluster_formation.consul.use_longname = true

In this example, the service address reported to Consul is parsed from node name (the rabbit@ prefix will be dropped):

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local

cluster_formation.consul.svc = rabbitmq
# do compute service address
cluster_formation.consul.svc_addr_auto = true
# compute service address using node name
cluster_formation.consul.svc_addr_use_nodename = true
# use long RabbitMQ node names?
cluster_formation.consul.use_longname = true

cluster_formation.consul.svc_addr_use_nodename is a boolean field that instructs Consul peer discovery backend to compute service address using RabbitMQ node name.

In the next example, the service address is computed using hostname as reported by the OS instead of node name:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local

cluster_formation.consul.svc = rabbitmq
# do compute service address
cluster_formation.consul.svc_addr_auto = true
# compute service address using host name and not node name
cluster_formation.consul.svc_addr_use_nodename = false
# use long RabbitMQ node names?
cluster_formation.consul.use_longname = true

In the example below, the service address is computed by taking the IP address of a provided NIC, en0:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local

cluster_formation.consul.svc = rabbitmq
# do compute service address
cluster_formation.consul.svc_addr_auto = true
# compute service address using the IP address of a NIC, en0
cluster_formation.consul.svc_addr_nic = en0
cluster_formation.consul.svc_addr_use_nodename = false
# use long RabbitMQ node names?
cluster_formation.consul.use_longname = true
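The four service address examples above can be read as one decision procedure. The sketch below is an interpretation, not the plugin's source: the function name and the `nic_ip_lookup` callback are hypothetical, and the precedence of svc_addr_nic over svc_addr_use_nodename is an assumption inferred from the last example, where both keys are set.

```python
from typing import Callable, Optional


def service_address(
    svc_addr_auto: bool,
    svc_addr: Optional[str],
    svc_addr_nic: Optional[str],
    svc_addr_use_nodename: bool,
    node_name: str,
    os_hostname: str,
    nic_ip_lookup: Callable[[str], str],
) -> Optional[str]:
    """Pick the address a node would report to Consul (illustrative only)."""
    if not svc_addr_auto:
        # Hardcoded address, taken as is.
        return svc_addr
    if svc_addr_nic is not None:
        # Assumed precedence: a configured NIC wins over the nodename setting.
        return nic_ip_lookup(svc_addr_nic)
    if svc_addr_use_nodename:
        # Drop everything up to and including "@" from the node name.
        return node_name.split("@", 1)[-1]
    return os_hostname


if __name__ == "__main__":
    # Hypothetical node and NIC values for demonstration.
    print(
        service_address(
            svc_addr_auto=True,
            svc_addr=None,
            svc_addr_nic=None,
            svc_addr_use_nodename=True,
            node_name="rabbit@hostname1.rmq.eng.example.local",
            os_hostname="hostname1",
            nic_ip_lookup=lambda nic: "192.168.100.1",
        )
    )
```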

Service port as registered in Consul can be overridden. This is only necessary if RabbitMQ uses a non-standard port for client (technically AMQP 0-9-1 and AMQP 1.0) connections since default value is 5672.

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# 5672 is used by default
cluster_formation.consul.svc_port = 6674

When a node registers with Consul, it will set up a periodic health check for itself. Online nodes will periodically send a health check update to Consul to indicate the service is available. This interval can be configured:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# health check interval (node TTL) in seconds
# default: 30
cluster_formation.consul.svc_ttl = 40

A node that failed its health check is considered to be in the warning state by Consul. Such nodes can be automatically unregistered by Consul after a period of time (note: this is a separate interval value from the TTL above). The period cannot be less than 60 seconds.

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# health check interval (node TTL) in seconds
cluster_formation.consul.svc_ttl = 30
# how soon should nodes that fail their health checks be unregistered by Consul?
# this value is in seconds and must not be lower than 60 (a Consul requirement)
cluster_formation.consul.deregister_after = 90

Please see the section on automatic cleanup of nodes below.

Nodes in the warning state are excluded from peer discovery results by default. It is possible to opt into including them by setting cluster_formation.consul.include_nodes_with_warnings to true:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local
# health check interval (node TTL) in seconds
cluster_formation.consul.svc_ttl = 30
# include node in the warning state into discovery result set
cluster_formation.consul.include_nodes_with_warnings = true

If node name is computed and long node names are used, it is possible to append a suffix to node names retrieved from Consul. The format is .node.{domain_suffix}. This can be useful in environments with DNS conventions, e.g. when all service nodes are organised in a separate subdomain. Here's an example:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul

cluster_formation.consul.host = consul.eng.example.local

cluster_formation.consul.svc = rabbitmq
# do compute service address
cluster_formation.consul.svc_addr_auto = true
# compute service address using node name
cluster_formation.consul.svc_addr_use_nodename = true
# use long RabbitMQ node names?
cluster_formation.consul.use_longname = true
# append a suffix (.node.example.local) to node names retrieved from Consul
cluster_formation.consul.domain_suffix = example.local

With this setup node names will be computed to [email protected] instead of [email protected].
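The suffix format described above, .node.{domain_suffix}, amounts to simple string concatenation. The sketch below shows it with a hypothetical helper and a hypothetical node name; it is not taken from the plugin.

```python
from typing import Optional


def append_domain_suffix(node_name: str, domain_suffix: Optional[str]) -> str:
    """Append ".node.{domain_suffix}" to a node name when a suffix is configured."""
    if not domain_suffix:
        return node_name  # no suffix configured: name is used unchanged
    return f"{node_name}.node.{domain_suffix}"


if __name__ == "__main__":
    # "rabbit@node1" is a made-up node name used only for illustration.
    print(append_domain_suffix("rabbit@node1", "example.local"))
```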

Peer Discovery Using etcd

An etcd-based discovery mechanism is available via a plugin. etcd v3 and v2 are supported.

Nodes register with etcd on boot by creating a key in a conventionally named directory. The keys have a short (say, a minute) expiration period. The keys are deleted when nodes stop cleanly. Prior to registration, nodes will attempt to acquire a lock in etcd to reduce the probability of a race condition during initial cluster formation.

Nodes contact etcd periodically to refresh their keys. Those that haven't done so in a configurable period of time (node TTL) are cleaned up from etcd. If configured, such nodes can be forcefully removed from the cluster.

To use etcd for peer discovery, set the cluster_formation.peer_discovery_backend to rabbit_peer_discovery_etcd and provide an etcd node hostname for the plugin to connect to:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

# etcd host (hostname or IP address). This property is required or peer discovery won't be performed.
cluster_formation.etcd.host = etcd.eng.example.local

It is possible to configure etcd port and URI scheme:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

cluster_formation.etcd.host = etcd.eng.example.local
# 2379 is used by default
cluster_formation.etcd.port = 2379
# http is used by default
cluster_formation.etcd.scheme = http

Directories and keys used by the peer discovery mechanism follow a naming scheme:

/v2/keys/{key_prefix}/{cluster_name}/nodes/{node_name}

Here's an example of a key that would be used by node [email protected] with default key prefix and cluster name:

/v2/keys/rabbitmq/default/nodes/[email protected]

Default key prefix is simply "rabbitmq". It rarely needs overriding but that's supported:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

cluster_formation.etcd.host = etcd.eng.example.local
# rabbitmq is used by default
cluster_formation.etcd.key_prefix = rabbitmq_discovery
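The key naming scheme shown above can be expressed as a small helper. This is a sketch for illustration: the function name and the sample node name are hypothetical, while the path template and the defaults ("rabbitmq" and "default") come from this section.

```python
def node_key_path(node_name: str, key_prefix: str = "rabbitmq", cluster_name: str = "default") -> str:
    """Build the etcd key path used to register a node."""
    return f"/v2/keys/{key_prefix}/{cluster_name}/nodes/{node_name}"


if __name__ == "__main__":
    # "rabbit@node1" is a made-up node name used only for illustration.
    print(node_key_path("rabbit@node1"))
    print(node_key_path("rabbit@node1", key_prefix="rabbitmq_discovery"))
```

Changing either the key prefix or the cluster name moves a node's key to a different directory, which is what keeps multiple clusters apart on a shared etcd installation.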

If multiple RabbitMQ clusters share an etcd installation, each cluster must use a unique name:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

cluster_formation.etcd.host = etcd.eng.example.local
# default name: "default"
cluster_formation.etcd.cluster_name = staging

Keys used for node registration will have a TTL interval set for them. Online nodes will periodically refresh their key(s). The TTL value can be configured:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

cluster_formation.etcd.host = etcd.eng.example.local
# node TTL in seconds
# default: 30
cluster_formation.etcd.ttl = 40

Key refreshes will be performed every TTL/2 seconds. It is possible to forcefully remove the nodes that fail to refresh their keys from the cluster. This is covered later in this guide.

When a node tries to acquire a lock on boot and the lock is already taken, it will wait for the lock to become available with a timeout. Default value is 300 seconds but it can be configured:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_etcd

cluster_formation.etcd.host = etcd.eng.example.local
# lock acquisition timeout in seconds
# default: 300
cluster_formation.etcd.lock_wait_time = 60

Race Conditions During Initial Cluster Formation

Consider a deployment where the entire cluster is provisioned at once and all nodes start in parallel. In this case there's a natural race condition between node registration and more than one node can become "first to register" (discovers no existing peers and thus starts as standalone).

Different peer discovery backends use different approaches to minimize the probability of such a scenario. Some use locking (etcd, Consul), others use a technique known as randomized startup delay. With randomized startup delay nodes will delay their startup for a randomly picked value (between 5 and 60 seconds by default).

Some backends (config file, DNS) rely on a pre-configured set of peers and avoid the issue that way.

Effective delay interval, if used, is logged on node boot.
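A minimal sketch of the randomized startup delay idea follows. It is not the server's implementation: the function name is hypothetical, and picking a whole number of seconds uniformly from the range is an assumption. The 5 and 60 second defaults are the ones quoted above.

```python
import random
from typing import Optional


def pick_startup_delay(
    min_seconds: int = 5,
    max_seconds: int = 60,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick a random delay (inclusive range) to spread out parallel node boots."""
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    if rng is None:
        rng = random.Random()
    return rng.randint(min_seconds, max_seconds)


if __name__ == "__main__":
    delay = pick_startup_delay()
    # Mirrors the behaviour described above: the effective delay is logged on boot.
    print(f"will delay startup by {delay} seconds")
```

The delay lowers the chance that two blank nodes both conclude they are first, but it does not eliminate it, which is why locking backends and ordered startup are treated as separate approaches.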

Lastly, some mechanisms rely on ordered node startup provided by the underlying provisioning and orchestration tool. Kubernetes stateful sets are one example of an environment that offers such a guarantee.

Node Health Checks and Cleanup

Sometimes a node is a cluster member but not known to the discovery backend. For example, consider a cluster that uses the AWS backend configured to use autoscaling group membership. If an EC2 instance in that group fails and is later re-created, it will be considered an unavailable node in the RabbitMQ cluster. With some peer discovery backends such unknown nodes can be logged or forcefully removed from the cluster.

Forced node removal can be dangerous and should be carefully considered. For example, a node that's temporarily unavailable but will be rejoining (or recreated with its persistent storage re-attached from its previous incarnation) can be kicked out of the cluster permanently by automatic cleanup, thus failing to rejoin.

Before enabling the configuration keys covered below make sure that a compatible peer discovery plugin is enabled. If that's not the case the node will report the settings to be unknown and will fail to start.

To log warnings for the unknown nodes, cluster_formation.node_cleanup.only_log_warning should be set to true:

# Don't remove cluster members unknown to the peer discovery backend but log
# warnings.
#
# This setting can only be used if a compatible peer discovery plugin is enabled.
cluster_formation.node_cleanup.only_log_warning = true

This is the default behavior.

To forcefully delete the unknown nodes from the cluster, cluster_formation.node_cleanup.only_log_warning should be set to false.

# Forcefully remove cluster members unknown to the peer discovery backend. Once removed,
# the nodes won't be able to rejoin. Use this mode with great care!
#
# This setting can only be used if a compatible peer discovery plugin is enabled.
cluster_formation.node_cleanup.only_log_warning = false

Note that this option should be used with care, in particular with discovery backends other than AWS.

Some backends (Consul, etcd) support node health checks (or TTL). Nodes periodically notify their respective discovery service (e.g. Consul) that they are still available. If no notifications from a node come in after a period of time, the node is considered to be in the warning state. With etcd, such nodes will no longer show up in discovery results. With Consul, they can either be removed (deregistered) or their warning state can be reported. Please see documentation for those backends to learn more.

Automatic cleanup of absent nodes makes most sense in environments where failed/discontinued nodes will be replaced with brand new ones (including cases when persistent storage won't be re-attached).

When automatic node cleanup is disabled (switched to the warning mode), operators have to explicitly remove absent cluster nodes using CLI tools.
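The two cleanup modes can be contrasted with a short sketch. It is an illustration only: the function name, the return shape and the node names are hypothetical. "Unknown" nodes are modelled as cluster members that are absent from the discovery backend's result set, as described at the start of this section.

```python
from typing import Iterable, List, Tuple


def plan_node_cleanup(
    cluster_members: Iterable[str],
    discovered_nodes: Iterable[str],
    only_log_warning: bool = True,
) -> Tuple[List[str], List[str]]:
    """Return (warnings, nodes_to_remove) for members unknown to the backend."""
    unknown = sorted(set(cluster_members) - set(discovered_nodes))
    if only_log_warning:
        # Warning mode: nothing is removed, operators clean up manually.
        warnings = [f"node {name} is not known to the peer discovery backend" for name in unknown]
        return (warnings, [])
    # Forced removal mode: removed nodes will not be able to rejoin.
    return ([], unknown)


if __name__ == "__main__":
    members = ["rabbit@node1", "rabbit@node2", "rabbit@node3"]
    discovered = ["rabbit@node1", "rabbit@node3"]
    print(plan_node_cleanup(members, discovered))
    print(plan_node_cleanup(members, discovered, only_log_warning=False))
```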

Troubleshooting

The peer discovery subsystem and individual mechanism implementations log important discovery procedure steps at the info log level. More extensive logging is available at the debug level. Mechanisms that depend on external services accessible over HTTP will log all outgoing HTTP requests and response codes at debug level. See the logging guide for more information about logging configuration.

If the log does not contain any entries that demonstrate peer discovery progress, for example, the list of nodes retrieved by the mechanism or clustering attempts, it may mean that the node already has an initialised data directory or is already a member of the cluster. In those cases peer discovery won't be performed.

Peer discovery relies on inter-node network connectivity and successful authentication via a shared secret. Verify that nodes can communicate with one another and use the expected Erlang cookie value (which must also be identical across all cluster nodes). See the main Clustering guide for more information.

A methodology for network connectivity troubleshooting as well as commonly used tools are covered in the Troubleshooting Network Connectivity guide.
