
Runtime Tuning

Overview

RabbitMQ runs on the Erlang virtual machine and runtime. A compatible version of Erlang must be installed in order to run RabbitMQ.

The Erlang runtime includes a number of components used by RabbitMQ. The most important ones as far as this guide is concerned are the Erlang virtual machine and epmd.

This guide will focus on the virtual machine. For an overview of epmd, please refer to the Networking guide.

Topics covered include:

  • How VM flags are configured and the generic environment variables used to set them
  • Runtime schedulers, CPU resource contention and scheduler-to-CPU core binding
  • Memory allocator settings
  • Inter-node communication buffer size
  • I/O thread pool size
  • Runtime thread statistics

VM Settings

The Erlang VM has a broad range of options that can be configured that cover process scheduler settings, memory allocation, garbage collection, I/O, and more. Tuning of those flags can significantly change runtime behavior of a node.

Configuring Flags

Most of the settings can be configured using environment variables. A few settings have dedicated variables; the rest can only be changed using the following generic variables that control what flags are passed by RabbitMQ startup scripts to the Erlang virtual machine.

The generic variables are

  • RABBITMQ_SERVER_ERL_ARGS allows all VM flags to be overridden, including the defaults set by RabbitMQ scripts
  • RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS allows a set of flags to be appended to the defaults set by RabbitMQ scripts
  • RABBITMQ_CTL_ERL_ARGS controls CLI tool VM flags

In most cases RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS is the recommended option: because it appends to the defaults set by RabbitMQ scripts instead of replacing them, it can be used to override individual flags in a safe manner. With RABBITMQ_SERVER_ERL_ARGS, by contrast, accidentally omitting an important default flag can unintentionally affect runtime performance characteristics or system limits.

As with other environment variables used by RabbitMQ, RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS and friends can be set using a separate environment variable file.
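For example, a minimal environment variable file (commonly /etc/rabbitmq/rabbitmq-env.conf on Linux; the exact location is platform-specific) could look like the following sketch. Note that variable names in this file drop the RABBITMQ_ prefix:

# example rabbitmq-env.conf entry; names in this file drop the RABBITMQ_ prefix
# the flag used here is purely illustrative
SERVER_ADDITIONAL_ERL_ARGS="+sbwt none"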

Runtime Schedulers

Schedulers in the runtime assign work to kernel threads that perform it. They execute code, perform I/O, execute timers and so on. Schedulers have a number of settings that can affect overall system performance, CPU utilisation, latency and other runtime characteristics of a node.

By default the runtime will start one scheduler for each CPU core it detects. This can be changed using the +S flag. The following example configures the node to start 4 schedulers even if more cores are detected:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+S 4:4"

Most of the time the default behaviour works well. In shared or CPU constrained environments (including containerised ones), explicitly configuring scheduler count may be necessary.
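For example, for a container limited to two CPUs, matching the scheduler count to the CPU quota (rather than the host's core count, which the runtime may otherwise detect) could look like this:

# container limited to 2 CPUs: start and bring online 2 schedulers
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+S 2:2"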

CPU Resource Contention

The runtime assumes that it does not share CPU resources with other tools or tenants. When that assumption does not hold, the scheduling mechanism used can become very inefficient and result in significant (up to several orders of magnitude) latency increases for certain operations.

This means that in most cases colocating RabbitMQ nodes with other tools, or applying CPU time slicing, is highly discouraged and will result in suboptimal performance.

Scheduler-to-CPU Core Binding

The number of schedulers won't always match the number of CPU cores available, and the number of CPU cores does not necessarily correlate to the number of hardware threads (due to hyperthreading, for example). As such the runtime has to decide how to bind schedulers to hardware threads, CPU cores and NUMA nodes.

There are several binding strategies available. The desired strategy can be specified using the RABBITMQ_SCHEDULER_BIND_TYPE environment variable or the +stbt VM flag:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+stbt nnts"
RABBITMQ_SCHEDULER_BIND_TYPE="nnts"

Note that the strategy will only be effective if the runtime can detect CPU topology in the given environment.

Valid values are:

  • db (used by default, alias for tnnps in current Erlang release series)
  • tnnps
  • nnts
  • nnps
  • ts
  • ps
  • s
  • ns

See VM flag documentation for more detailed descriptions.
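To verify which bindings are in effect on a running node, the runtime can be queried directly, for instance via rabbitmqctl eval (the output format depends on the Erlang version; schedulers that could not be bound are reported as unbound):

# reports current scheduler-to-hardware-thread bindings
rabbitmqctl eval 'erlang:system_info(scheduler_bindings).'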

Scheduler Wakeup Threshold

It is possible to make schedulers that currently do not have work to do go to sleep instead of busy waiting, using the +sbwt (scheduler busy wait threshold) flag:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none"

The value of none can reduce CPU usage on systems that have a large number of mostly idle connections.
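Modern Erlang releases provide analogous flags for dirty CPU and dirty I/O schedulers, +sbwtdcpu and +sbwtdio. On mostly idle nodes the three flags are sometimes combined, as in this sketch (assuming an Erlang version that supports the dirty scheduler flags):

# disables busy waiting for regular, dirty CPU and dirty I/O schedulers
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"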

Memory Allocator Settings

The runtime manages (allocates and releases) memory. Runtime memory management is a complex topic with many tunable parameters. This section only covers the basics.

Memory is allocated in blocks from larger pre-allocated areas called carriers. Settings that control carrier size, block size, memory allocation strategy and so on are commonly referred to as allocator settings.

Depending on the allocator settings used and the workload, RabbitMQ can experience memory fragmentation of various degrees. Finding the best fit for your workload is a matter of trial, measurement (metric collection) and error. Note that some degree of fragmentation is inevitable.
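One rough way to gauge fragmentation is to compare the amount of memory the runtime reports as used with the resident set size the OS reports for the node's process; a persistently large gap suggests the allocator settings are a poor fit for the workload. A minimal sketch (assumes a Linux host running a single node, i.e. one beam.smp process):

# memory the runtime believes is in use, in bytes
rabbitmqctl eval 'erlang:memory(total).'
# resident set size reported by the OS, in kilobytes
ps -o rss= -p $(pgrep -f beam.smp)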

Here are the allocator arguments used by default:

RABBITMQ_DEFAULT_ALLOC_ARGS="+MBas ageffcbf +MHas ageffcbf +MBlmbcs 512 +MHlmbcs 512 +MMmcs 30"

Instead of overriding RABBITMQ_DEFAULT_ALLOC_ARGS, add flags that should be overridden to RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS. They will take precedence over the default ones. So a node started with the following RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS value

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+MHlmbcs 8192"

will use the following effective allocator settings:

"+MBas ageffcbf +MHas ageffcbf +MBlmbcs 512 +MHlmbcs 8192 +MMmcs 30"

For some workloads a larger preallocated area reduces allocation rate and memory fragmentation. To configure the node to use a preallocated area of 1 GB, add +MMscs 1024 to VM startup arguments using RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+MMscs 1024"

The value is in MB. The following example will preallocate a larger, 4 GB area:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+MMscs 4096"

To learn about other available settings, see runtime documentation on allocators.

Inter-node Communication Buffer Size

Inter-node traffic between a pair of nodes uses a TCP connection with a buffer known as the inter-node communication buffer. Its size is 128 MB by default, which is reasonable for most workloads. In some environments inter-node traffic can be very heavy and run into the buffer's capacity. Other workloads where the default is not a good fit involve transferring very large messages (say, hundreds of megabytes) that do not fit into the buffer.

In this case the value can be increased using the RABBITMQ_DISTRIBUTION_BUFFER_SIZE environment variable or the +zdbbl VM flag. The value is in kilobytes:

RABBITMQ_DISTRIBUTION_BUFFER_SIZE=192000
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+zdbbl 192000"
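To confirm the limit a running node actually uses, the effective value (reported in bytes) can be read back from the runtime:

# reports the effective inter-node communication buffer limit, in bytes
rabbitmqctl eval 'erlang:system_info(dist_buf_busy_limit).'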

When the buffer is hovering around full capacity, nodes will log a warning mentioning a busy distribution port (busy_dist_port):

2019-04-06 22:48:19.031 [warning] <0.242.0> rabbit_sysmon_handler busy_dist_port <0.1401.0>

Increasing the buffer size may help increase throughput and/or reduce latency.

I/O Thread Pool Size

The runtime uses a pool of threads for performing I/O operations asynchronously. The size of the pool is configured via the RABBITMQ_IO_THREAD_POOL_SIZE environment variable. The variable is a shortcut to setting the +A VM command line flag, e.g. +A 128.

# reduces number of I/O threads from 128 to 32
RABBITMQ_IO_THREAD_POOL_SIZE=32

To set the flag directly, use the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS environment variable:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128"

The default value in recent RabbitMQ releases is 128 (30 previously). Nodes that have 8 or more cores available are recommended to use values higher than 96, that is, 12 or more I/O threads for every core available.
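The effective pool size of a running node can be confirmed by querying the runtime:

# reports the size of the async (I/O) thread pool set via +A
rabbitmqctl eval 'erlang:system_info(thread_pool_size).'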

Note that higher values do not necessarily mean better throughput or lower CPU burn due to waiting on I/O. There are relevant metrics about runtime thread activity; they are covered in the Thread Statistics section below.

Thread Statistics

RabbitMQ CLI tools provide a number of metrics that make it easier to reason about runtime thread activity. The following command produces a breakdown of how various threads spend their time:

rabbitmq-diagnostics runtime_thread_stats

The output is a table with percentages by thread activity:

  • emulator: general code execution
  • port: external I/O activity (socket I/O, file I/O, subprocesses)
  • gc: performing garbage collection
  • check_io: checking for I/O events
  • other, aux: busy waiting, managing timers, all other tasks
  • sleep: sleeping (idle state)

A significant percentage of activity in the external I/O state may indicate that the node and/or its clients have maxed out network link capacity. This can be confirmed by infrastructure metrics.

A significant percentage of activity in the sleeping state might indicate a lightly loaded node or suboptimal runtime scheduler configuration for the available hardware and workload.
