
Networking and RabbitMQ

Introduction

Clients communicate with RabbitMQ over the network. All protocols supported by the broker are TCP-based. Both RabbitMQ and the operating system provide a number of knobs that can be tweaked. Some of them are directly related to TCP and IP operations, others have to do with application-level protocols such as TLS. This guide covers multiple topics related to networking in the context of RabbitMQ. This guide is not meant to be an extensive reference but rather an overview. Some tuneable parameters discussed are OS-specific. This guide focuses on Linux when covering OS-specific subjects, as it is the most common platform RabbitMQ is deployed on.

There are several areas which can be configured or tuned:

  • Interfaces and ports
  • TLS
  • TCP socket settings (e.g. buffer size)
  • Kernel TCP settings (e.g. TCP keepalives)
  • Heartbeats (AMQP 0-9-1, STOMP), known as keepalives in MQTT
  • Hostnames and DNS
Except for OS kernel parameters and DNS, all RabbitMQ settings are configured via RabbitMQ configuration file(s).

Networking is a broad topic. There are many configuration options that can have positive or negative effect on certain workloads. As such, this guide does not try to be a complete reference but rather offer an index of key tunable parameters and serve as a starting point.

Network Interfaces

For RabbitMQ to accept client connections, it needs to bind to one or more interfaces and listen on (protocol-specific) ports. The interfaces are configured using the rabbit.tcp_listeners config option. By default, RabbitMQ will listen on port 5672 on all available interfaces.

TCP listeners configure both interface and port. The following example demonstrates how to configure RabbitMQ on a specific IP and standard port:

[
  {rabbit, [
    {tcp_listeners, [{"192.168.1.99", 5672}]}
  ]}
].

Listening on Dual Stack (Both IPv4 and IPv6) Interfaces

The following example demonstrates how to configure RabbitMQ to listen on localhost only for both IPv4 and IPv6:

[
  {rabbit, [
    {tcp_listeners, [{"127.0.0.1", 5672},
                     {"::1",       5672}]}
  ]}
].

With modern Linux kernels and Windows releases after Vista, when a port is specified and RabbitMQ is configured to listen on all IPv6 addresses but IPv4 is not explicitly disabled, IPv4 addresses will be included as well, so

[
  {rabbit, [
    {tcp_listeners, [{"::",       5672}]}
  ]}
].
is equivalent to
[
  {rabbit, [
    {tcp_listeners, [{"0.0.0.0", 5672},
                     {"::",      5672}]}
  ]}
].

Listening on IPv4 Interfaces Only

In this example RabbitMQ will listen on an IPv4 interface only:

[
  {rabbit, [
    {tcp_listeners, [{"192.168.1.99", 5672}]}
  ]}
].

Alternatively, if a single stack setup is desired, the interface can be configured using the RABBITMQ_NODE_IP environment variable. See our Configuration guide for details.

Listening on IPv6 Interfaces Only

In this example RabbitMQ will listen on an IPv6 interface only:

[
  {rabbit, [
    {tcp_listeners, [{"fe80::2acf:e9ff:fe17:f97b", 5672}]}
  ]}
].

Alternatively, if a single stack setup is desired, the interface can be configured using the RABBITMQ_NODE_IP environment variable. See our Configuration guide for details.

Port Access

SELinux and similar mechanisms may prevent RabbitMQ from binding to a port. When that happens, RabbitMQ will fail to start. Firewalls can prevent nodes and CLI tools from communicating with each other. Make sure the following ports can be opened:

  • 4369: epmd, a peer discovery service used by RabbitMQ nodes and CLI tools
  • 5672, 5671: used by AMQP 0-9-1 and 1.0 clients without and with TLS
  • 25672: used by Erlang distribution for inter-node and CLI tools communication and is allocated from a dynamic range (limited to a single port by default, computed as AMQP port + 20000). See the EPMD and inter-node communication section below for details.
  • 15672: HTTP API clients and rabbitmqadmin (only if the management plugin is enabled)
  • 61613, 61614: STOMP clients without and with TLS (only if the STOMP plugin is enabled)
  • 1883, 8883: MQTT clients without and with TLS (only if the MQTT plugin is enabled)
  • 15674: STOMP-over-WebSockets clients (only if the Web STOMP plugin is enabled)
  • 15675: MQTT-over-WebSockets clients (only if the Web MQTT plugin is enabled)
It is possible to configure RabbitMQ to use different ports and specific network interfaces.

EPMD and Inter-node Communication Port(s)

Erlang makes use of a Port Mapper Daemon (epmd) for resolution of node names in a cluster. The default epmd port is 4369, but this can be changed using the ERL_EPMD_PORT environment variable. All nodes must use the same port. For further details see the Erlang epmd manpage.
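For example, to run epmd on a non-default port, export the variable before starting nodes and CLI tools (a minimal sketch; the port value is arbitrary):

export ERL_EPMD_PORT=4370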

Once a distributed Erlang node address has been resolved via epmd, other nodes will attempt to communicate directly with that address using the Erlang distribution protocol. See the following section for details.

RabbitMQ nodes communicate with CLI tools and other nodes using a port known as the distribution port. It is dynamically allocated from a range of values. By default the range is limited to a single value computed as configured RABBITMQ_NODE_PORT (AMQP port) + 20000, or 25672. This single port range can be configured using the RABBITMQ_DIST_PORT environment variable.
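For example, to pin the distribution port to a specific value, the variable can be exported before the node is started (a sketch; the port value is arbitrary):

export RABBITMQ_DIST_PORT=25680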

The range can also be controlled via two configuration keys:

  • kernel.inet_dist_listen_min
  • kernel.inet_dist_listen_max
They define the range's lower and upper bounds, inclusive.

The example below uses a range with a single port but a value different from default:

[
  {kernel, [
    {inet_dist_listen_min, 33672},
    {inet_dist_listen_max, 33672}
  ]},
  {rabbit, [
    ...
  ]}
].

To verify what port is used by a node for inter-node and CLI tool communication, run

epmd -names
on that node's host. It will produce output that looks like this:
epmd: up and running on port 4369 with data:
name rabbit at port 25672

TLS (SSL) Support

It is possible to encrypt connections using TLS with RabbitMQ. Authentication using peer certificates is also possible. Please refer to the TLS/SSL guide for more information.

Tuning for Throughput

Tuning for throughput is a common goal. Improvements can be achieved by

  • Increasing TCP buffer sizes
  • Ensuring Nagle's algorithm is disabled
  • Enabling optional TCP features and extensions
For the latter two, see the OS-level tuning section below. Note that tuning for throughput will involve trade-offs. For example, increasing TCP buffer sizes will increase the amount of RAM used by every connection, which can be a significant total server RAM use increase.

TCP Buffer Size

This is one of the key tunable parameters. Every TCP connection has buffers allocated for it. Generally speaking, the larger these buffers are, the more RAM is used per connection and the better the throughput. On Linux, the OS will automatically tune TCP buffer size by default, typically settling on a value between 80 and 120 KB. For maximum throughput, it is possible to increase buffer size using the rabbit.tcp_listen_options, rabbitmq_mqtt.tcp_listen_options, rabbitmq_amqp1_0.tcp_listen_options, and related config keys.

The following example sets TCP buffers for AMQP 0-9-1 connections to 192 KiB:

[
  {rabbit, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        196608},
                          {recbuf,        196608}
                         ]}
  ]}
].
The same example for MQTT and STOMP connections:
[
  {rabbitmq_mqtt, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        196608},
                          {recbuf,        196608}
                         ]}
  ]},
  {rabbitmq_stomp, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        196608},
                          {recbuf,        196608}
                         ]}
  ]}
].
Note that setting send and receive buffer sizes to different values is dangerous and is not recommended.

Erlang VM I/O Thread Pool

The Erlang runtime uses a pool of threads for performing I/O operations asynchronously. The size of the pool is configured via the +A VM command line flag, e.g. +A 128. We highly recommend overriding the flag using the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS environment variable:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128"
The default value in recent RabbitMQ releases is 128 (30 previously). Nodes that have 8 or more cores available are recommended to use values higher than 96, that is, 12 or more I/O threads for every core available. Note that higher values do not necessarily mean better throughput or lower CPU burn due to waiting on I/O.

Tuning for a Large Number of Connections

Some workloads, often referred to as "the Internet of Things", assume a large number of client connections per node and a relatively low volume of traffic from each client. One such workload is sensor networks: there can be hundreds of thousands or millions of sensors deployed, each emitting data every several minutes. Optimising for the maximum number of concurrent clients can be more important than optimising for total throughput.

Several factors can limit how many concurrent connections a single node can support:

  • Maximum number of open file handles (including sockets) as well as other kernel-enforced resource limits
  • Amount of RAM used by each connection
  • Amount of CPU resources used by each connection
  • Maximum number of Erlang processes the VM is configured to allow

Open File Handle Limit

Most operating systems limit the number of file handles that can be opened at the same time. When an OS process (such as RabbitMQ's Erlang VM) reaches the limit, it won't be able to open any new files or accept any more TCP connections.

How the limit is configured varies from OS to OS and distribution to distribution, e.g. depending on whether systemd is used. For Linux, see Controlling System Limits on Linux in our Debian and RPM installation guides. Linux kernel limit management is covered by many resources on the Web, including those on the open file handle limit. macOS uses a similar system.

When optimising for the number of concurrent connections, make sure your system has enough file descriptors to support not only client connections but also the files the node may use. To calculate a ballpark limit, multiply the number of connections per node by 1.5. For example, to support 100,000 connections, set the limit to 150,000. Increasing the limit slightly increases the amount of RAM an idle machine uses, but this is a reasonable trade-off.
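How to inspect and raise the limit depends on the distribution; the sketch below assumes a systemd-managed installation and a hypothetical drop-in file path:

# effective limit for the current shell user; a running node also reports its
# file descriptor limit and usage in rabbitmqctl status output
ulimit -n

# with systemd, the limit can be raised for the service via a drop-in unit file,
# e.g. /etc/systemd/system/rabbitmq-server.service.d/limits.conf containing:
#   [Service]
#   LimitNOFILE=150000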

TCP Buffer Size

See the section above for an overview. It is possible to decrease buffer size using the rabbit.tcp_listen_options, rabbitmq_mqtt.tcp_listen_options, rabbitmq_amqp1_0.tcp_listen_options, and related config keys to reduce the amount of RAM used by the server per connection. This is often necessary in environments where the number of concurrent connections sustained per node is more important than throughput.

The following example sets TCP buffers for AMQP 0-9-1 connections to 32 KiB:

[
  {rabbit, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        32768},
                          {recbuf,        32768}
                         ]}
  ]}
].
The same example for MQTT and STOMP connections:
[
  {rabbitmq_mqtt, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        32768},
                          {recbuf,        32768}
                         ]}
  ]},
  {rabbitmq_stomp, [
    {tcp_listen_options, [
                          {backlog,       128},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false},
                          {sndbuf,        32768},
                          {recbuf,        32768}
                         ]}
  ]}
].
Note that lower TCP buffer sizes will result in a significant throughput drop, so an optimal value between throughput and per-connection RAM use needs to be found for every workload. Setting send and receive buffer sizes to different values is dangerous and is not recommended. Values lower than 8 KiB are not recommended.

Nagle's Algorithm ("nodelay")

Disabling Nagle's algorithm is primarily useful for reducing latency but can also improve throughput. kernel.inet_default_connect_options and kernel.inet_default_listen_options must include {nodelay, true} to disable Nagle's algorithm for inter-node connections. When configuring sockets that serve client connections, rabbit.tcp_listen_options must include the same option. This is the default. The following example demonstrates that:

[
  {kernel, [
    {inet_default_connect_options, [{nodelay, true}]},
    {inet_default_listen_options,  [{nodelay, true}]}
  ]},
  {rabbit, [
    {tcp_listen_options, [
                          {backlog,       4096},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false}
                         ]}
  ]}
].

Erlang VM I/O Thread Pool Tuning

Adequate Erlang VM I/O thread pool size is also important when tuning for a large number of concurrent connections. See the section above.

Connection Backlog

With a low number of clients, new connection rate is very unevenly distributed but is also small enough to not make much difference. When the number reaches tens of thousands or more, it is important to make sure that the server can accept inbound connections. Unaccepted TCP connections are put into a queue with bounded length. This length has to be sufficient to account for peak load hours and possible spikes, for instance, when many clients disconnect due to a network interruption or choose to reconnect. This is configured using the rabbit.tcp_listen_options.backlog option:

[
  {rabbit, [
    {tcp_listen_options, [
                          {backlog,       4096},
                          {nodelay,       true},
                          {linger,        {true,0}},
                          {exit_on_close, false}
                         ]}
  ]}
].
Default value is 128. When pending connection queue length grows beyond this value, connections will be rejected by the operating system. See also net.core.somaxconn in the kernel tuning section.

OS Level Tuning

Operating system settings can affect operation of RabbitMQ. Some are directly related to networking (e.g. TCP settings), others affect TCP sockets as well as other things (e.g. open file handles limit). Understanding these limits is important, as they may change depending on the workload.

A few important configurable kernel options include (for IPv4):

fs.file-max
Max number of files the kernel will allocate. Limits and current value can be inspected using /proc/sys/fs/file-nr.
net.ipv4.ip_local_port_range
Local IP port range, defined as a pair of values. The range must provide enough entries for the peak number of concurrent connections.
net.ipv4.tcp_tw_reuse
When enabled, allows the kernel to reuse sockets in TIME_WAIT state when it's safe to do so. See Coping with the TCP TIME_WAIT connections on busy servers for details. This option is dangerous when used behind NAT.
net.ipv4.tcp_fin_timeout
Lowering this value to 5-10 reduces the amount of time closed connections will stay in the TIME_WAIT state. Recommended for cases when a large number of concurrent connections is expected.
net.core.somaxconn
Size of the listen queue (how many connections are in the process of being established at the same time). Default is 128. Increase to 4096 or higher to support inbound connection bursts, e.g. when clients reconnect en masse.
net.ipv4.tcp_max_syn_backlog
Maximum number of remembered connection requests which did not yet receive an acknowledgment from the connecting client. Default is 128, max value is 65535. 4096 and 8192 are recommended starting values when optimising for throughput.
net.ipv4.tcp_keepalive_*
net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl, and net.ipv4.tcp_keepalive_probes configure TCP keepalive. AMQP 0-9-1 and STOMP have Heartbeats which partially undo its effect, namely that it can take minutes to detect an unresponsive peer, e.g. in case of a hardware or power failure. MQTT also has its own keepalives mechanism which is the same idea under a different name. When enabling TCP keepalive with default settings, we recommend setting heartbeat timeout to 8-20 seconds. Also see a note on TCP keepalives later in this guide.
net.ipv4.conf.default.rp_filter
Enables reverse path filtering. If IP address spoofing is not a concern for your system, disable it.
Note that default values for these vary between Linux kernel releases and distributions. Using a recent kernel (3.9 or later) is recommended.

Kernel parameter tuning differs from OS to OS. This guide focuses on Linux. To configure a kernel parameter interactively, use sysctl -w (requires superuser privileges), for example:

sysctl -w fs.file-max=200000
To make the changes permanent (stick between reboots), they need to be added to /etc/sysctl.conf. See sysctl(8) and sysctl.conf(5) for more details.
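As an illustration, a hypothetical /etc/sysctl.conf fragment for a node expected to serve a large number of connections might combine several of the parameters above (the fs.file-max, tcp_fin_timeout, somaxconn and tcp_max_syn_backlog values follow the recommendations in this guide; the local port range is an arbitrary example):

fs.file-max = 200000
net.ipv4.ip_local_port_range = 10000 64000
net.ipv4.tcp_fin_timeout = 10
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096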

TCP stack tuning is a broad topic that is covered in much detail elsewhere.

TCP Socket Options

Common Options

rabbit.tcp_listen_options.nodelay
When set to true, disables Nagle's algorithm. Default is true. Highly recommended for most users.
rabbit.tcp_listen_options.sndbuf
See the TCP buffers discussion earlier in this guide. The default value is automatically tuned by the OS, typically in the 88 KiB to 128 KiB range on modern Linux versions. Increasing buffer size improves consumer throughput but also increases RAM use for every connection. Decreasing it has the opposite effect.
rabbit.tcp_listen_options.recbuf
See the TCP buffers discussion earlier in this guide. Its effects are similar to those of rabbit.tcp_listen_options.sndbuf, but for publishers and protocol operations in general.
rabbit.tcp_listen_options.backlog
Maximum size of the unaccepted TCP connections queue. When this size is reached, new connections will be rejected. Set to 4096 or higher for environments with thousands of concurrent connections and possible bulk client reconnections.
rabbit.tcp_listen_options.linger
When set to {true, N}, sets the timeout in seconds for flushing unsent data when the (server) socket is closed.
rabbit.tcp_listen_options.keepalive
When set to true, enables TCP keepalives (see above). Default is false. Makes sense for environments where connections can go idle for a long time (at least 10 minutes), although using heartbeats is still recommended over this option.
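For example, to enable TCP keepalives for client connections while keeping the other defaults shown below (a sketch; merge with any other listener options used in your environment):

[
  {rabbit, [
    {tcp_listen_options, [{backlog,       128},
                          {nodelay,       true},
                          {linger,        {true, 0}},
                          {exit_on_close, false},
                          {keepalive,     true}]}
  ]}
].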

Defaults

Below is the default TCP socket option configuration used by RabbitMQ:

[
  {rabbit, [
    {tcp_listen_options, [{backlog,       128},
                          {nodelay,       true},
                          {linger,        {true, 0}},
                          {exit_on_close, false}]}
  ]}
].

Heartbeats

Some protocols supported by RabbitMQ, including AMQP 0-9-1, support heartbeats, a way to detect dead TCP peers quicker. Please refer to the Heartbeats guide for more information.
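As a brief illustration (the Heartbeats guide remains the authoritative reference), the heartbeat timeout the server suggests to clients during connection negotiation can be set via rabbit.heartbeat, in seconds:

[
  {rabbit, [
    %% suggest a 20 second heartbeat timeout to connecting clients
    {heartbeat, 20}
  ]}
].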

Net Tick Time

Heartbeats are used to detect peer or connection failure between clients and RabbitMQ nodes. net_ticktime serves the same purpose but for cluster node communication. Values lower than 5 (seconds) may result in false positives and are not recommended.
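It can be changed via the kernel application's net_ticktime setting (in seconds); a minimal sketch:

[
  {kernel, [
    {net_ticktime, 120}
  ]},
  {rabbit, [
    ...
  ]}
].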

TCP Keepalives

TCP includes a mechanism similar in purpose to the heartbeats (a.k.a. keepalives) found in messaging protocols and the net tick timeout covered above: TCP keepalives. Due to inadequate defaults, TCP keepalives often don't work the way they are supposed to: it takes a very long time (say, an hour or more) to detect a dead peer. However, with tuning they can serve the same purpose as heartbeats and clean up stale TCP connections, e.g. with clients that opted to not use heartbeats, intentionally or not. Below is an example sysctl configuration for TCP keepalives that considers TCP connections dead or unreachable after 120 seconds (4 attempts every 15 seconds after the connection has been idle for 60 seconds):

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_intvl=15
net.ipv4.tcp_keepalive_probes=4
TCP keepalives can be a useful additional defense mechanism in environments where RabbitMQ operator has no control over application settings or client libraries used.

Connection Handshake Timeout

RabbitMQ has a timeout for connection handshake, 10 seconds by default. When clients run in heavily constrained environments, it may be necessary to increase the timeout. This can be done via the rabbit.handshake_timeout config key (in milliseconds):

[
  {rabbit, [
    %% 20 seconds
    {handshake_timeout, 20000}
  ]}
].
It should be pointed out that this is only necessary with very constrained clients and networks. Handshake timeouts in other circumstances indicate a problem elsewhere.

TLS/SSL Handshake

If TLS/SSL is enabled, it may also be necessary to increase the TLS/SSL handshake timeout. This can be done via the rabbit.ssl_handshake_timeout config key (in milliseconds):

[
  {rabbit, [
    %% 10 seconds
    {ssl_handshake_timeout, 10000}
  ]}
].

Hostname Resolution and DNS

RabbitMQ relies on the Erlang runtime for inter-node communication (including with CLI tools such as rabbitmqctl and rabbitmq-plugins). Client libraries also perform hostname resolution when connecting to RabbitMQ nodes. This section briefly covers the most common issues associated with that.

Performed by Client Libraries

If a client library is configured to connect to a hostname, it performs hostname resolution. Depending on DNS and local resolver (/etc/hosts and similar) configuration, this can take some time. Incorrect configuration may lead to resolution timeouts, e.g. when trying to resolve a local hostname such as my-dev-machine, over DNS. As a result, client connections can take a long time (from tens of seconds to a few minutes).

Short and Fully-qualified RabbitMQ Node Names

RabbitMQ relies on the Erlang runtime for inter-node communication. Erlang nodes include a hostname, either short (rmq1) or fully-qualified (rmq1.dev.megacorp.local). Mixing short and fully-qualified hostnames is not allowed by the runtime. Every node in a cluster must be able to resolve every other node's hostname, short or fully-qualified. By default RabbitMQ will use short hostnames. Set the RABBITMQ_USE_LONGNAME environment variable to make RabbitMQ nodes use fully-qualified names, e.g. rmq1.dev.megacorp.local.
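For example, the variable can be exported before the node (and any CLI tools that talk to it) is started; a minimal sketch:

export RABBITMQ_USE_LONGNAME=true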

Reverse DNS Lookups

If the rabbit.reverse_dns_lookups configuration option is set to true, RabbitMQ will perform reverse DNS lookups for client IP addresses and list hostnames in connection information (e.g. in the Management UI).
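In the classic config format used throughout this guide, enabling the option looks like this:

[
  {rabbit, [
    {reverse_dns_lookups, true}
  ]}
].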