
Reasoning About Memory Use

Introduction

Operators need to be able to reason about a node's memory use, both absolute and relative ("what uses the most memory"). This is an important aspect of system monitoring.

RabbitMQ provides tools that report and help analyse node memory use:

  • rabbitmqctl status provides a memory breakdown section
  • management UI provides the same breakdown on the node page as rabbitmqctl status
  • HTTP API provides the same information as the management UI, useful for monitoring
  • rabbitmq-top, a plugin inspired by the top utility

Obtaining a node memory breakdown should be the first step when reasoning about node memory use.

Note that all measurements are somewhat approximate, based on values returned by the underlying runtime or the kernel at a specific point in time, usually within a 5 second time window.

Total Memory Use Calculation Strategies

Starting with version 3.6.11, different strategies can be used to compute how much memory a node uses. Historically, nodes obtained this information from the runtime, reporting how much memory is used (not just allocated). This strategy, known as legacy (an alias for erlang), tends to underreport and is not recommended.

The effective strategy is configured using the vm_memory_calculation_strategy key.

rss uses OS-specific means of querying the kernel to find the RSS (Resident Set Size) value of the node's OS process. This strategy is the most precise and is used by default on Linux, macOS, BSD and Solaris systems. When this strategy is used, RabbitMQ runs short-lived subprocesses once a second.

allocated is a strategy that queries runtime memory allocator information. It is usually quite close to the values reported by the rss strategy. This strategy is used by default on Windows.

The vm_memory_calculation_strategy setting also impacts memory breakdown reporting. If set to legacy (erlang) or allocated, some memory breakdown fields will not be reported. This is covered in more detail further in this guide.

The following config example uses the rss strategy:

vm_memory_calculation_strategy = rss

Similarly, for the allocated strategy, use:

vm_memory_calculation_strategy = allocated

To configure the rss strategy using classic config format:

[
  {rabbit, [{vm_memory_calculation_strategy, rss}]}
].

Similarly, for the allocated strategy, use:

[
  {rabbit, [{vm_memory_calculation_strategy, allocated}]}
].

To find out what strategy a node uses, see its effective configuration.

Memory Use Breakdown

How Memory Breakdown Works

Memory use breakdown reports allocated memory distribution by category:

  • Connections (further split into three categories: readers, writers, other)
  • Channels
  • Queue master replicas
  • Queue mirror replicas
  • Message Store and Indices
  • Binaries
  • Node-local metrics (stats database)
  • Internal database tables
  • Plugins
  • Memory allocated but not yet used
  • Code (bytecode, module metadata)
  • ETS (in memory key/value store) tables
  • Atom tables
  • Other

Generally there is no overlap between the categories (no double accounting for the same memory). Plugins and runtime versions may affect this.

Producing Memory Use Breakdown Using rabbitmqctl

A common way of producing memory breakdown is via rabbitmqctl status:

{memory,
    [{connection_readers,70896},
     {connection_writers,166752},
     {connection_channels,1239768},
     {connection_other,233336},
     {queue_procs,2941784},
     {queue_slave_procs,0},
     {plugins,4633344},
     {other_proc,21878696},
     {metrics,215544},
     {mgmt_db,1244248},
     {mnesia,79296},
     {other_ets,2299848},
     {binary,4660864},
     {msg_index,47880},
     {code,25423126},
     {atom,1041593},
     {other_system,22215713},
     {allocated_unused,28552208},
     {reserved_unallocated,0},
     {total,90398720}]}

| Report Field | Category | Details |
|---|---|---|
| total | | Total amount as reported by the effective memory calculation strategy (see above) |
| connection_readers | Connections | Processes responsible for the connection parser and most of the connection state. Most of their memory is attributed to TCP buffers. The more client connections a node has, the more memory will be used by this category. See the Networking guide for more information. |
| connection_writers | Connections | Processes responsible for serialisation of outgoing protocol frames and writing to client connection sockets. The more client connections a node has, the more memory will be used by this category. See the Networking guide for more information. |
| connection_channels | Channels | The more channels client connections use, the more memory will be used by this category. |
| connection_other | Connections | Other memory related to client connections |
| queue_procs | Queues | Queue masters, indices and messages kept in memory. The greater the number of messages enqueued, the more memory will generally be attributed to this section. However, this greatly depends on queue properties and whether messages were published as transient. See the Memory, Queues, and Lazy Queues guides for more information. |
| queue_slave_procs | Queues | Queue mirrors, indices and messages kept in memory. Reducing the number of mirrors (replicas) or not mirroring queues with inherently transient data can reduce the amount of RAM used by mirrors. The greater the number of messages enqueued, the more memory will generally be attributed to this section. However, this greatly depends on queue properties and whether messages were published as transient. See the Memory, Queues, Mirroring, and Lazy Queues guides for more information. |
| metrics | Stats DB | Node-local metrics. The more connections, channels and queues a node hosts, the more stats there are to collect and keep. See the management plugin guide for more information. |
| mgmt_db | Stats DB | Aggregated and pre-computed metrics, inter-node HTTP API request cache and everything else related to the stats DB. See the management plugin guide for more information. |
| binary | Binaries | Runtime binary heap. Most of this section is usually message bodies and properties (metadata). |
| plugins | Plugins | Plugins such as Shovel, Federation, or protocol implementations such as STOMP can accumulate messages in memory. |
| allocated_unused | Preallocated Memory | Allocated by the runtime but not yet used. |
| reserved_unallocated | Preallocated Memory | Allocated/reserved by the kernel but not the runtime |
| mnesia | Internal Database | Virtual hosts, users, permissions, queue metadata and state, exchanges, bindings, runtime parameters and so on. |
| other_ets | Internal Database | Some plugins can use ETS tables to store their state |
| code | Code | Bytecode and module metadata. This should only consume double digit % of memory on blank/empty nodes. |
| other_proc | Other | All other processes that RabbitMQ cannot categorise |
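
To answer "what uses the most memory", the report fields can be ranked by their share of total. The following is a minimal Python sketch, assuming the breakdown has already been parsed into a dictionary. The top_categories helper is hypothetical and not part of RabbitMQ; the sample values are copied from the rabbitmqctl status output above.

```python
# Sample values copied from the `rabbitmqctl status` output shown above.
SAMPLE_BREAKDOWN = {
    "connection_readers": 70896,
    "connection_writers": 166752,
    "connection_channels": 1239768,
    "connection_other": 233336,
    "queue_procs": 2941784,
    "queue_slave_procs": 0,
    "plugins": 4633344,
    "other_proc": 21878696,
    "metrics": 215544,
    "mgmt_db": 1244248,
    "mnesia": 79296,
    "other_ets": 2299848,
    "binary": 4660864,
    "msg_index": 47880,
    "code": 25423126,
    "atom": 1041593,
    "other_system": 22215713,
    "allocated_unused": 28552208,
    "reserved_unallocated": 0,
    "total": 90398720,
}


def top_categories(breakdown, n=3):
    """Return the n largest categories as (name, bytes, rounded % of total)."""
    total = breakdown["total"]
    # "total" is the reference value, not a category of its own.
    parts = [(name, size) for name, size in breakdown.items() if name != "total"]
    parts.sort(key=lambda item: item[1], reverse=True)
    return [(name, size, round(100 * size / total)) for name, size in parts[:n]]


for name, size, percent in top_categories(SAMPLE_BREAKDOWN):
    print(f"{name:20} {size:>12} bytes  ~{percent}%")
```

For this sample the three largest entries are allocated_unused, code and other_system, which is typical of a nearly idle node rather than a sign of a problem.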

Producing Memory Use Breakdown Using Management UI

Management UI can be used to produce a memory breakdown chart. This information is available on the node metrics page that can be accessed from Overview:

[Figure: Cluster node list in management UI]

On the node metrics page, scroll down to the memory breakdown buttons:

[Figure: Node memory use breakdown buttons]

Memory and binary heap breakdowns can be expensive to calculate and are produced on demand when the Update button is pressed:

[Figure: Node memory use breakdown chart]

It is also possible to display a breakdown of binary heap use by various things in the system (e.g. connections, queues):

[Figure: Binary heap use breakdown chart]

Producing Memory Use Breakdown Using HTTP API and curl

It is possible to produce a memory use breakdown over the HTTP API by issuing a GET request to the /api/nodes/{node}/memory endpoint.

curl -s -u guest:guest http://127.0.0.1:15672/api/nodes/{node}/memory |
  python -m json.tool

{
    "memory": {
        "atom": 1041593,
        "binary": 5133776,
        "code": 25299059,
        "connection_channels": 1823320,
        "connection_other": 150168,
        "connection_readers": 83760,
        "connection_writers": 113112,
        "metrics": 217816,
        "mgmt_db": 266560,
        "mnesia": 93344,
        "msg_index": 48880,
        "other_ets": 2294184,
        "other_proc": 27131728,
        "other_system": 21496756,
        "plugins": 3103424,
        "queue_procs": 2957624,
        "queue_slave_procs": 0,
        "total": 89870336
    }
}
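
For monitoring scripts, the same request can be made without curl. The following is a rough Python sketch using only the standard library. The function names, the rabbit@example-host node name and the timeout are assumptions made for illustration; only the endpoint path and the "memory" key come from the example above.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def node_memory_url(base_url, node):
    """Build the URL of the /api/nodes/{node}/memory endpoint."""
    # "@" is kept unescaped so the path looks like the curl example above.
    return f"{base_url.rstrip('/')}/api/nodes/{quote(node, safe='@')}/memory"


def fetch_memory_breakdown(base_url, node, user, password, timeout=10):
    """GET the breakdown and return the inner "memory" object as a dict."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        node_memory_url(base_url, node),
        headers={"Authorization": f"Basic {token}"},  # HTTP basic auth
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["memory"]


# Usage (hypothetical node name, default guest credentials):
#   breakdown = fetch_memory_breakdown(
#       "http://127.0.0.1:15672", "rabbit@example-host", "guest", "guest")
#   for name, size in sorted(breakdown.items(), key=lambda kv: kv[1], reverse=True):
#       print(f"{name:22} {size}")
```

Returning only the inner object keeps the result in the same shape as the breakdown reported by rabbitmqctl status, so the same post-processing can be applied to both.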

It is also possible to retrieve a relative breakdown by issuing a GET request to the /api/nodes/{node}/memory/relative endpoint. Note that reported relative values are rounded to integers. This endpoint is intended to be used for relative comparison (identifying top contributing categories), not precise calculations.

curl -s -u guest:guest http://127.0.0.1:15672/api/nodes/{node}/memory/relative |
  python -m json.tool

{
    "memory": {
        "allocated_unused": 32,
        "atom": 1,
        "binary": 5,
        "code": 22,
        "connection_channels": 2,
        "connection_other": 1,
        "connection_readers": 1,
        "connection_writers": 1,
        "metrics": 1,
        "mgmt_db": 1,
        "mnesia": 1,
        "msg_index": 1,
        "other_ets": 2,
        "other_proc": 21,
        "other_system": 19,
        "plugins": 3,
        "queue_procs": 4,
        "queue_slave_procs": 0,
        "reserved_unallocated": 0,
        "total": 100
    }
}
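
The relative breakdown is mostly useful for picking out the dominant categories. The following is a small Python sketch over the sample response above. The helper names and the 10% threshold are arbitrary choices made for illustration, not RabbitMQ conventions.

```python
# Values copied from the sample /memory/relative response shown above.
RELATIVE = {
    "allocated_unused": 32, "atom": 1, "binary": 5, "code": 22,
    "connection_channels": 2, "connection_other": 1,
    "connection_readers": 1, "connection_writers": 1,
    "metrics": 1, "mgmt_db": 1, "mnesia": 1, "msg_index": 1,
    "other_ets": 2, "other_proc": 21, "other_system": 19,
    "plugins": 3, "queue_procs": 4, "queue_slave_procs": 0,
    "reserved_unallocated": 0, "total": 100,
}


def dominant_categories(relative, threshold=10):
    """Return (name, percent) pairs at or above threshold, largest first."""
    parts = [
        (name, percent)
        for name, percent in relative.items()
        if name != "total" and percent >= threshold
    ]
    return sorted(parts, key=lambda item: item[1], reverse=True)


def sum_of_parts(relative):
    """Add up every entry except "total"."""
    return sum(percent for name, percent in relative.items() if name != "total")


print(dominant_categories(RELATIVE))
print(sum_of_parts(RELATIVE))
```

In this sample the individual entries add up to 118 rather than 100, which shows why the relative values should only be used for comparison between categories.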

Memory Breakdown Categories

Connections

This includes memory used by client connections (including Shovels and Federation links) and channels, and outgoing ones (Shovels and Federation upstream links). Most of the memory is usually used by TCP buffers, which on Linux autotune to about 100 kB in size by default. TCP buffer size can be reduced at the cost of a proportional decrease in connection throughput. See the Networking guide for details.

Channels also consume RAM. By optimising how many channels applications use, that amount can be decreased. It is possible to cap the max number of channels on a connection using the channel_max configuration setting:

channel_max = 16

Note that some libraries and tools that build on top of RabbitMQ clients may implicitly require a certain number of channels. Finding an optimal value is usually a matter of trial and error.

Queues and Messages

Memory used by queues, queue indices, queue state. Messages enqueued will in part contribute to this category.

Queues will swap their contents out to disc when under memory pressure. The exact behavior of this depends on queue properties, whether clients publish messages as persistent or transient, and persistence configuration of the node.

Message bodies do not show up here but in Binaries.

Message Store Indexes

By default, the message store uses an in-memory index of all messages, including those paged out to disc. Plugins allow for replacing it with disc-based implementations.

Plugins

Memory used by plugins (apart from the Erlang client, which is counted under Connections, and the management database, which is counted separately). This category will include some per-connection memory for protocol plugins such as STOMP and MQTT as well as messages enqueued by plugins such as Shovel and Federation.

Preallocated Memory

Memory preallocated by the runtime (VM allocators) but not yet used. This is covered in more detail below.

Internal Database

The internal database (Mnesia) keeps an in-memory copy of all its data (even on disc nodes). Typically this will only be large when there are a large number of queues, exchanges, bindings, users or virtual hosts. Plugins can store data in the same database as well.

Management (Stats) Database

Memory used by the stats database (if the management plugin is enabled). In a cluster, most stats are stored locally on the node. Cross-node requests needed to aggregate stats in a cluster can be cached. The cached data will be reported in this category.

Binaries

Memory used by shared binary data in the runtime. Most of this memory is message bodies and metadata.

Other ETS tables

Other in-memory tables besides those belonging to the stats database and internal database tables.

Code

Memory used by code (bytecode, module metadata). This section is usually fairly constant and relatively small (unless the node is entirely blank and stores no data).

Atoms

Memory used by atoms. Should be fairly constant.

Per-process Analysis with rabbitmq-top

rabbitmq-top is a plugin that helps identify runtime processes ("lightweight threads") that consume most memory or scheduler (CPU) time.

The plugin ships with RabbitMQ. Enable it with

[sudo] rabbitmq-plugins enable rabbitmq_top

The plugin adds new administrative tabs to the management UI. One tab displays top processes by one of the metrics:

  • Memory used
  • Reductions (unit of scheduler/CPU consumption)
  • Erlang mailbox length
  • For gen_server2 processes, internal operation buffer length

[Figure: Top processes in rabbitmq-top]

The second tab displays ETS (internal key/value store) tables. The tables can be sorted by the amount of memory used or number of rows:

[Figure: Top ETS tables in rabbitmq-top]

Memory Use Monitoring

It is recommended that production systems monitor memory usage of all cluster nodes, ideally with a breakdown, together with infrastructure-level metrics. By correlating breakdown categories with other metrics, e.g. the number of concurrent connections or enqueued messages, it becomes possible to detect problems that stem from application-specific behavior (e.g. connection leaks or ever growing queues without consumers).

Preallocated memory

The Erlang memory breakdown reports only memory that is currently being used, and not the memory that has been allocated for later use or reserved by the operating system. OS tools like ps can therefore report more memory used than the runtime does. The difference consists of memory that is allocated but not used, as well as memory that is unallocated but reserved by the OS. Both values depend on the OS and Erlang VM allocator settings and can fluctuate significantly.

Note that which of these sections are reported depends on the vm_memory_calculation_strategy setting. If the strategy is set to erlang, unused memory will not be reported. If it is set to allocated, memory reserved by the OS will not be reported. Therefore rss is the strategy that provides the most information from both the kernel and the runtime.

The runtime's memory allocator behavior can be tuned. Please refer to the erl and erts_alloc documentation.

Queue Memory

How much memory does a message use?

A message has multiple parts that use up memory:

  • Payload - >= 1 byte - variable size, typically few hundred bytes to a few hundred kilobytes
  • Protocol attributes - >= 0 bytes - variable size, contains headers, priority, timestamp, reply to, etc.
  • RabbitMQ metadata - >= 720 bytes - variable size, contains exchange, routing keys, message properties, persistence, redelivery status, etc.
  • RabbitMQ message ordering structure - 16 bytes

Messages with a 1KB payload will use up about 2KB of memory once attributes and metadata are factored in.
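
The per-message figures above can be added up to get a lower bound. The following is a small Python sketch, assuming the minimum sizes listed above. The function name and the 288 bytes of protocol attributes are hypothetical values chosen for illustration.

```python
METADATA_BYTES = 720  # minimum RabbitMQ metadata per message, from the list above
ORDERING_BYTES = 16   # message ordering structure, from the list above


def min_message_memory(payload_bytes, attribute_bytes=0):
    """Lower-bound estimate of the memory used by one message, in bytes."""
    return payload_bytes + attribute_bytes + METADATA_BYTES + ORDERING_BYTES


# 1 KB payload without protocol attributes.
print(min_message_memory(1024))
# Same payload with a hypothetical 288 bytes of headers and other attributes.
print(min_message_memory(1024, 288))
```

The first call gives 1760 bytes and the second 2048 bytes, which is in line with the roughly 2KB figure quoted for a 1KB payload. Real messages can use more, since the metadata size is variable.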

Some messages can be stored on disk, but still have their metadata kept in memory.

How much memory does a queue use?

A queue is an Erlang process. If a queue is mirrored, each mirror is a separate Erlang process.

Since a queue master is a single Erlang process, message ordering can be guaranteed. Multiple queues means multiple Erlang processes, each of which gets an even amount of CPU time. This ensures that no queue can block other queues.

The memory use of a single queue can be obtained via the HTTP API:

curl -s -u guest:guest http://127.0.0.1:15672/api/queues/%2f/queue-name |
  python -m json.tool

{
    ..
    "memory": 97921904,
    ...
    "message_bytes_ram": 2153429941,
    ...
}

  • memory - memory used by the queue process, accounts for message metadata (at least 720 bytes per message), does not account for message payloads over 64 bytes
  • message_bytes_ram - memory used by the message payloads, regardless of size

If messages are small, message metadata can use more memory than the message payload. 10,000 messages with 1 byte of payload will use 10KB of message_bytes_ram (payload) & 7MB of memory (metadata).

If message payloads are large, they will not be reflected in the queue process memory. 10,000 messages with 100 KB of payload will use 976MB of message_bytes_ram (payload) & 7MB of memory (metadata).
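
The two examples above can be reproduced with simple arithmetic. The following is a minimal Python sketch, assuming equal-sized messages, the 720 byte metadata minimum and binary units (1 KB = 1024 bytes). The function name is hypothetical, and protocol attributes and process overhead are ignored.

```python
KIB = 1024
MIB = 1024 * KIB
METADATA_BYTES = 720  # minimum per-message metadata, as stated in this guide


def queue_memory_estimate(message_count, payload_bytes):
    """Return (payload MiB, metadata MiB) for a queue of equal-sized messages."""
    payload_total = message_count * payload_bytes     # roughly message_bytes_ram
    metadata_total = message_count * METADATA_BYTES   # lower bound for memory
    return payload_total / MIB, metadata_total / MIB


for payload in (1, 100 * KIB):
    payload_mib, metadata_mib = queue_memory_estimate(10_000, payload)
    print(
        f"payload {payload} B: message_bytes_ram ~{payload_mib:.2f} MiB, "
        f"memory >= {metadata_mib:.2f} MiB"
    )
```

With 10,000 messages the metadata estimate is about 6.87 MiB in both cases, while the payload estimate grows from about 0.01 MiB to 976.56 MiB, matching the rounded figures in the text.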

Why does the queue memory grow and shrink when publishing/consuming?

Erlang uses generational garbage collection for each Erlang process. Garbage collection is done per queue, independently of all other Erlang processes.

When garbage collection runs, it will copy used process memory before deallocating unused memory. This can lead to the queue process using up to twice as much memory during garbage collection, as shown here (queue contains a lot of messages):

[Figure: Queue under load memory usage]

Can queue memory growth during garbage collection be dangerous?

If the Erlang VM tries to allocate more memory than is available, the VM itself will either crash or be killed by the OOM killer. When the Erlang VM crashes, RabbitMQ will lose all non-persistent data.

The high memory watermark blocks publishers and prevents new messages from being enqueued. Since garbage collection can double the memory used by a queue, it is unsafe to set the high memory watermark above 0.5. The default high memory watermark is set to 0.4 since this is safer as not all memory is used by queues. The right value is entirely workload specific and differs across RabbitMQ deployments.
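
The reasoning behind the 0.5 limit can be illustrated with a deliberately simplified model. The following Python sketch assumes the node sits exactly at the watermark and that a given share of its memory belongs to queue processes being garbage collected at the same moment. The function and its queue_share parameter are hypothetical and do not correspond to any RabbitMQ setting.

```python
def peak_memory_fraction(watermark, queue_share=1.0):
    """Worst-case fraction of RAM in use while queue memory is copied during GC.

    watermark   -- high memory watermark as a fraction of available RAM
    queue_share -- fraction of the node's memory held by queue processes
                   that are garbage collected at the same time (assumption)
    """
    in_use = watermark                  # node is right at the watermark
    gc_copy = watermark * queue_share   # GC may temporarily duplicate this part
    return in_use + gc_copy


for watermark in (0.4, 0.5, 0.6):
    peak = peak_memory_fraction(watermark)
    verdict = "fits" if peak <= 1.0 else "exceeds available memory"
    print(f"watermark {watermark}: peak {peak:.1f} -> {verdict}")
```

In this model a watermark of 0.6 peaks at 1.2 times the available memory when all memory is held by queues, while 0.4 leaves headroom even in that worst case.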

We recommend using many queues so that memory allocation and garbage collection are spread across many Erlang processes.

If the messages in a queue take up a lot of memory, we recommend lazy queues so that they are stored on disk as soon as possible and not kept in memory longer than is necessary.

Getting Help and Providing Feedback

If you have questions about the contents of this guide or any other topic related to RabbitMQ, don't hesitate to ask them on the RabbitMQ mailing list.

Documentation feedback is also very welcome on the list. If you'd like to contribute an improvement to the site, its source is available on GitHub.