Feature Flags

Overview

In a mixed-version cluster (e.g. some nodes run 3.7.x and others run 3.8.x), nodes may support different sets of features, behave differently in certain scenarios, and otherwise not act exactly the same: they are different versions after all.

Feature flags are a mechanism that controls which features are considered enabled or available on all cluster nodes. If a feature flag is enabled, so is its associated feature (or behavior). If it is not, all nodes in the cluster disable the feature (or behavior).

The feature flag subsystem allows RabbitMQ nodes with different versions to determine if they are compatible and then communicate together, regardless of their version.

This subsystem was introduced in RabbitMQ 3.8.0 to allow rolling upgrades of cluster members without shutting down the entire cluster.

This subsystem does not guarantee that all future changes in RabbitMQ can be implemented as feature flags and be entirely backwards compatible with older release series. Therefore, a future version of RabbitMQ might still require a cluster-wide shutdown for upgrading. Please always read the release notes to see if a rolling upgrade to the next minor or major RabbitMQ version is possible.

Quick summary (TL;DR)

The Two Main Rules

  • A feature flag can be enabled only if all nodes in the cluster support it.
  • A node can join or re-join a cluster only if:
    1. it supports all feature flags enabled in the cluster, and
    2. the cluster supports all feature flags enabled on that node.

RabbitMQ 3.7.x and 3.8.x nodes are compatible as long as no 3.8.x feature flags are enabled.

The Two Main Commands

  • To list feature flags:
    rabbitmqctl list_feature_flags
  • To enable a feature flag:
    rabbitmqctl enable_feature_flag <name>

It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".

The Two Examples

Example 1: Compatible Nodes

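In this example, node A supports "Coffee maker" and "Juicer machine", node B supports only "Coffee maker", and "Juicer machine" is not enabled on node A.

  • If nodes A and B are not clustered, they can be clustered.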
  • If nodes A and B are clustered:
    • "Coffee maker" can be enabled.
    • "Juicer machine" cannot be enabled because it is unsupported by node B.

Example 2: Incompatible Nodes

In this example, "Juicer machine" is enabled on node A and unsupported on node B.

  • If nodes A and B are not clustered, they cannot be clustered because "Juicer machine" is unsupported on node B.
  • If nodes A and B are clustered and "Juicer machine" was enabled while node B was stopped, node B cannot re-join the cluster on restart.

Feature Flags and RabbitMQ Versions

As covered earlier, the feature flags subsystem's primary goal is to allow upgrades regardless of the version of RabbitMQ, if possible.

Therefore, as of RabbitMQ 3.8.0, it is possible to upgrade to the next patch, minor or major release, unless the release notes state otherwise: some changes cannot be implemented as feature flags.

It is also possible to upgrade from RabbitMQ 3.7.x to 3.8.x. Indeed, RabbitMQ 3.7.x does not have the feature flags subsystem and RabbitMQ 3.8.x considers that a 3.7.x node has an empty list of feature flags. Therefore, as long as the 3.8.x node has all its feature flags disabled, it is compatible with a 3.7.x node.

However, note that only upgrading from one minor to the next minor or major is supported. To upgrade from e.g. 3.8.5 to 3.10.0, it is necessary to upgrade to 3.9.x first. Likewise, if one or more minor release branches lie between the minor version in use and the next major release, the cluster must be upgraded through them first. Skipping releases might work (i.e. there could be no incompatible changes between major releases), but this scenario is unsupported by design for the following reasons:

  • Skipping minor versions is not tested in CI.
  • Non-sequential releases may or may not support the same set of feature flags. Support for older feature flags can be removed: once a flag has been present for several minor branches, it is removed and its associated feature or behavior becomes implicitly enabled, which prevents clustering with older nodes. Feature flags are kept around for a number (say, two) of minor releases to allow for a transition period.

The deprecation/removal policy of feature flags is yet to be defined.
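
The "one minor at a time" rule can be checked mechanically when planning an upgrade. The following is a minimal sketch in POSIX shell; the function name upgrade_step and its exit codes are hypothetical and not part of RabbitMQ. It only compares version numbers, ignores the patch component, and cannot tell whether a jump to the next major release skips a minor branch, so the release notes remain the authority.

```shell
# upgrade_step FROM TO
# Hypothetical helper (not a RabbitMQ tool). FROM and TO are
# "major.minor.patch" strings; the patch component is ignored.
# Exit status:
#   0  same minor, or the next minor within the same major
#   1  any other move within the same major (skipped minor or downgrade)
#   2  different major: cannot be decided from the numbers alone
upgrade_step() {
    from_major=${1%%.*}
    rest=${1#*.}
    from_minor=${rest%%.*}

    to_major=${2%%.*}
    rest=${2#*.}
    to_minor=${rest%%.*}

    if [ "$from_major" -ne "$to_major" ]; then
        return 2
    fi

    diff=$((to_minor - from_minor))
    if [ "$diff" -eq 0 ] || [ "$diff" -eq 1 ]; then
        return 0
    fi
    return 1
}
```

For instance, upgrade_step 3.8.5 3.9.0 succeeds, while upgrade_step 3.8.5 3.10.0 returns 1, which matches the example given above.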

How to List Supported Feature Flags

When a node starts for the first time, all supported feature flags are enabled by default. When a node is upgraded to a newer version of RabbitMQ, new feature flags are enabled by default if it is a single isolated node, or remain disabled by default if it belongs to a cluster.

To list the feature flags, use rabbitmqctl list_feature_flags:

rabbitmqctl list_feature_flags

# => Listing feature flags ...
# => name   state
# => empty_basic_get_metric enabled
# => implicit_default_bindings  enabled
# => quorum_queue   enabled

For improved table readability, switch to the pretty_table formatter:

rabbitmqctl -q --formatter pretty_table list_feature_flags \
  name state provided_by desc doc_url

which would produce a table that looks like this:

┌───────────────────────────┬─────────┬───────────────────────────┬───────┬────────────┐
│ name                      │ state   │ provided_by               │ desc  │ doc_url    │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ empty_basic_get_metric    │ enabled │ rabbitmq_management_agent │ (...) │            │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ implicit_default_bindings │ enabled │ rabbit                    │ (...) │            │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ quorum_queue              │ enabled │ rabbit                    │ (...) │ http://... │
└───────────────────────────┴─────────┴───────────────────────────┴───────┴────────────┘

As shown in the example above, the list_feature_flags command accepts a list of columns to display. The available columns are:

  • name: the name of the feature flag.
  • state: enabled or disabled, depending on whether the feature flag is enabled or disabled; unsupported if one or more nodes in the cluster do not know this feature flag (in which case it cannot be enabled).
  • provided_by: the RabbitMQ component or plugin which provides the feature flag.
  • desc: the description of the feature flag.
  • doc_url: the URL to a webpage to learn more about the feature flag.
  • stability: indicates if the feature flag is stable or experimental.
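
For scripting, for example to spot flags that are still disabled after an upgrade, the output can be filtered on the state column. This is a minimal sketch: it assumes rabbitmqctl is on the PATH and prints one whitespace-separated "name state" row per flag, as in the sample output above. The function name disabled_feature_flags is hypothetical.

```shell
# disabled_feature_flags
# Prints the name of every feature flag whose state is "disabled".
# Assumption: each row of output is "<name> <state>", separated by
# whitespace. Banner and header rows are skipped implicitly because
# their second field is never the literal word "disabled".
disabled_feature_flags() {
    rabbitmqctl -q list_feature_flags name state |
        awk '$2 == "disabled" { print $1 }'
}
```

Calling disabled_feature_flags on a node prints one flag name per line, or nothing if every flag is enabled.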

How to Enable Feature Flags

After upgrading one node or the entire cluster, it will be possible to enable new feature flags. Note that it will be impossible to roll back the version or add a cluster member using the old version once new feature flags are enabled.

To enable a feature flag, use rabbitmqctl enable_feature_flag:

rabbitmqctl enable_feature_flag <name>

The list_feature_flags command can be used again to verify the feature flags' states. Assuming all feature flags were disabled initially, here is the state after enabling the quorum_queue feature flag:

rabbitmqctl -q --formatter pretty_table list_feature_flags

┌───────────────────────────┬──────────┐
│ name                      │ state    │
├───────────────────────────┼──────────┤
│ empty_basic_get_metric    │ disabled │
├───────────────────────────┼──────────┤
│ implicit_default_bindings │ disabled │
├───────────────────────────┼──────────┤
│ quorum_queue              │ enabled  │
└───────────────────────────┴──────────┘
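
The enable-then-verify sequence above can be wrapped in a small helper. This is only a sketch: it assumes that list_feature_flags prints one whitespace-separated "name state" row per flag, and enable_and_verify is a hypothetical name, not a RabbitMQ command.

```shell
# enable_and_verify NAME
# Enables the given feature flag, then checks that the node reports it
# as "enabled". Returns non-zero if either step fails.
enable_and_verify() {
    flag=$1
    rabbitmqctl enable_feature_flag "$flag" || return 1

    # Look up the state reported for this flag only.
    state=$(rabbitmqctl -q list_feature_flags name state |
        awk -v flag="$flag" '$1 == flag { print $2 }')

    [ "$state" = "enabled" ]
}
```

A call such as enable_and_verify quorum_queue can then be used as a condition in an upgrade script.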

It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".

How to Disable Feature Flags

It is impossible to disable a feature flag once it is enabled.

List of Core Feature Flags

The feature flags listed below are those provided by RabbitMQ core or one of the tier-1 plugins bundled with RabbitMQ.

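┌───────────────────────────┬──────────────────────────────────────────────────────────────────────────┬─────────────────────────────────┐
│ Feature flag name         │ Description                                                              │ Lifecycle                       │
├───────────────────────────┼──────────────────────────────────────────────────────────────────────────┼─────────────────────────────────┤
│ empty_basic_get_metric    │ Count AMQP 0-9-1 basic.get issued on empty queues in statistics.         │ Introduction: 3.8.0, Removal: - │
│ implicit_default_bindings │ Clean up explicit default bindings now that they are managed implicitly. │ Introduction: 3.8.0, Removal: - │
│ quorum_queue              │ Add the quorum queue type.                                               │ Introduction: 3.8.0, Removal: - │
└───────────────────────────┴──────────────────────────────────────────────────────────────────────────┴─────────────────────────────────┘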

How Do Feature Flags Work?

From an Operator Point of View

Node and Version Compatibility

There are two times when an operator has to consider feature flags:

  • When extending an existing cluster by adding nodes using a different version of RabbitMQ (older or newer), the operator needs to pay attention to feature flags: they might prevent clustering.
  • After upgrading a cluster, the operator should take a look at the new feature flags and perhaps enable them.

A node compares its own list of feature flags with remote nodes' list of feature flags to determine if it can join a cluster. The rules are defined as:

  • All feature flags enabled locally must be supported remotely.
  • All feature flags enabled remotely must be supported locally.

It is important to understand the difference between enabled and supported:

  • A supported feature flag is one which is known by the node. It can be enabled or disabled, but its state is irrelevant at this point.
  • An enabled feature flag is one which is activated and used by the node. Per the definition above, it is implicitly a supported feature flag.

If either of the two rules above is not satisfied, the node cannot join or re-join the cluster.

However, if it can join the cluster, the state of enabled feature flags is synchronized between nodes: if a feature flag is enabled on one node, it is enabled on all other nodes.
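
The two rules can be expressed as a small toy model. The sketch below is not RabbitMQ's implementation: it represents each side's enabled and supported flags as space-separated lists, and the function names contains and can_cluster, as well as the flag names coffee_maker and juicer_machine (borrowed from the examples in the quick summary), are purely illustrative.

```shell
# contains LIST ITEM
# Succeeds if ITEM is one of the space-separated words in LIST.
contains() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# can_cluster ENABLED_LOCAL SUPPORTED_LOCAL ENABLED_REMOTE SUPPORTED_REMOTE
# Toy model of the two rules: succeeds only if both hold.
can_cluster() {
    # Rule 1: all flags enabled locally must be supported remotely.
    for flag in $1; do
        contains "$4" "$flag" || return 1
    done
    # Rule 2: all flags enabled remotely must be supported locally.
    for flag in $3; do
        contains "$2" "$flag" || return 1
    done
    return 0
}
```

With nothing enabled, can_cluster "" "coffee_maker juicer_machine" "" "coffee_maker" succeeds. Once juicer_machine is enabled on the first node, can_cluster "juicer_machine" "coffee_maker juicer_machine" "" "coffee_maker" fails. A 3.7.x node can be modelled with two empty lists, which is why it is compatible with a 3.8.x node only while that node has nothing enabled.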

Scope of the Feature Flags

The feature flags subsystem covers inter-node communication only. This means the following scenarios are not covered and may not work as initially expected.

Using rabbitmqctl on a remote node

Controlling a remote node with rabbitmqctl is only supported if the remote node is running the same version of RabbitMQ as the one rabbitmqctl comes from.

If CLI tools from a different minor/major version of RabbitMQ are used on a remote node, they may fail to work as expected or even have unexpected side effects on the node.

Load-balancing Requests to the HTTP API

If a request sent to the HTTP API exposed by the Management plugin goes through a load balancer (this includes requests made by the management UI), the API's behavior and its response may differ depending on the version of the node which handled the request. The same applies if the domain name of the HTTP API resolves to multiple IP addresses.

This situation may happen during a rolling upgrade if the management UI is open in a browser with periodic automatic refresh.

For example, if the management UI was loaded from a RabbitMQ 3.7.x node but it then queries a RabbitMQ 3.8.x node, the JavaScript code running in the browser may fail with exceptions due to HTTP API changes.

What Happens When a Feature Flag is Enabled

When a feature flag is enabled with rabbitmqctl, here is what happens internally:

  1. RabbitMQ verifies if the feature flag is already enabled. If yes, it stops.
  2. It verifies if the feature flag is supported. If no, it stops.
  3. It marks the feature flag state as state_changing. This is an internal transitional state to inform consumers of this feature flag. Most of the time, it means that components depending on this particular feature flag will be blocked until the state changes to enabled or disabled.
  4. It enables all feature flags this one depends on, going through this same procedure for each of them.
  5. It executes the migration function, if there is one. This function is responsible for preparing or converting various resources, such as changing the schema of a database.
  6. If all the steps above succeed, the feature flag state becomes enabled. Otherwise, it is reverted back to disabled.

As an operator, the most important part of this procedure to remember is that if the migration takes time, some components and thus some operations in RabbitMQ might be blocked during the migration.
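
The numbered procedure above can be modelled in a few lines of shell. Everything here is hypothetical: flag states live in shell variables named state_<flag>, dependencies in deps_<flag>, and an optional migration is a shell function named migrate_<flag>. It is a toy model of the six steps under those assumptions (including an acyclic dependency graph), not a description of RabbitMQ internals.

```shell
# Hypothetical storage helpers: one shell variable per flag.
get_state() { eval "printf '%s' \"\${state_$1:-unsupported}\""; }
set_state() { eval "state_$1=\$2"; }
get_deps()  { eval "printf '%s' \"\${deps_$1:-}\""; }

# enable_flag NAME
# Toy model of the enabling procedure; returns non-zero on failure.
enable_flag() {
    current=$(get_state "$1")

    # 1. Already enabled: nothing to do.
    if [ "$current" = enabled ]; then return 0; fi

    # 2. Not known to this model, hence unsupported: stop.
    if [ "$current" = unsupported ]; then return 1; fi

    # 3. Mark the flag as being in the transitional state.
    set_state "$1" state_changing

    # 4. Enable every dependency through the same procedure.
    for dep in $(get_deps "$1"); do
        if ! enable_flag "$dep"; then
            set_state "$1" disabled
            return 1
        fi
    done

    # 5. Run the migration function, if one is defined.
    if command -v "migrate_$1" >/dev/null 2>&1; then
        if ! "migrate_$1"; then
            set_state "$1" disabled
            return 1
        fi
    fi

    # 6. Everything succeeded: the flag is now enabled.
    set_state "$1" enabled
}
```

In this model, a slow migrate_<flag> function keeps the flag in state_changing for its whole duration, which corresponds to the window during which dependent components may be blocked.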

From a Developer Point of View

When working on a plugin or a RabbitMQ core contribution, feature flags should be used to make the new version of the code compatible with older versions of RabbitMQ.

When to Use a Feature Flag

It is the developer's responsibility to look at the list of existing and future (i.e. those added to the master branch) feature flags and see if the new code can be adapted to take advantage of them.

Here is an example. When developing a plugin which used to use the #amqqueue{} record defined in rabbit_common/include/rabbit.hrl, the plugin has to be adapted to use the new amqqueue API which hides the previous record (which is private now). However, there is no need to query feature flags for that: the plugin will be ABI-compatible (i.e. no need to recompile it) with RabbitMQ 3.8.0 and later. It should also be ABI-compatible with RabbitMQ 3.7.x once the amqqueue API appears in that branch.

However if the plugin targets quorum queues introduced in RabbitMQ 3.8.0, it may have to query feature flags to determine what it can do. For instance, can it declare a quorum queue? Can it even expect the new fields added to amqqueue as part of the quorum queues implementation?

If the plugin carefully checks feature flags to avoid any incorrect expectations, it will be compatible with many versions of RabbitMQ: the user will not have to recompile anything or download another version-specific copy of the plugin.
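
While developing such a plugin, it can help to check from the command line what a running node reports for a flag. The sketch below assumes that rabbitmqctl eval is available and that the rabbit_feature_flags module exports is_enabled/1 (the kind of check plugin code would make directly in Erlang). The wrapper name feature_flag_enabled is hypothetical.

```shell
# feature_flag_enabled NAME
# Succeeds if the local node reports the flag as enabled.
# Assumption: `rabbitmqctl eval` prints the Erlang term `true` or
# `false` for this expression. Returns 2 if the command itself fails.
feature_flag_enabled() {
    result=$(rabbitmqctl eval "rabbit_feature_flags:is_enabled($1).") || return 2
    [ "$result" = "true" ]
}
```

For example, feature_flag_enabled quorum_queue can guard a manual test that declares a quorum queue.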

When to Declare a Feature Flag

If a plugin or core broker change modifies one of the following aspects:

  • record definitions
  • replicated database schemas
  • the format of Erlang messages passed between nodes
  • modules and functions called from remote nodes

Then compatibility with older versions of RabbitMQ becomes a concern. This is where a new feature flag can help ensure a smoother upgrade experience.

The two most important parts of a feature flag are:

  • the declaration as a module attribute
  • the migration function

The declaration is a module attribute which looks like this:

-rabbit_feature_flag(
   {quorum_queue,
    #{desc          => "Support queues of type quorum",
      doc_url       => "http://www.rabbitmq.com/quorum-queues.html",
      stability     => stable,
      migration_fun => {?MODULE, quorum_queue_migration}
     }}).

The migration function is a stateless function which looks like this:

quorum_queue_migration(FeatureName, _FeatureProps, enable) ->
    Tables = ?quorum_queue_tables,
    rabbit_table:wait(Tables),
    Fields = amqqueue:fields(amqqueue_v2),
    migrate_to_amqqueue_with_type(FeatureName, Tables, Fields);
quorum_queue_migration(_FeatureName, _FeatureProps, is_enabled) ->
    Tables = ?quorum_queue_tables,
    rabbit_table:wait(Tables),
    Fields = amqqueue:fields(amqqueue_v2),
    mnesia:table_info(rabbit_queue, attributes) =:= Fields andalso
    mnesia:table_info(rabbit_durable_queue, attributes) =:= Fields.

More implementation docs can be found in the rabbit_feature_flags module source code.

Erlang's edoc reference can be generated locally from a RabbitMQ repository clone or source archive:

gmake edoc
# =>  ... Ignore warnings and errors...

# Now open `doc/rabbit_feature_flags.html` in the browser.

How to Adapt and Run Testsuites with mixed-version clusters

When a feature or behavior depends on a feature flag (either in the core broker or in a plugin), the associated testsuites must be adapted to take this feature flag into account. It means that before running the actual testcase, the setup code must verify if the feature flag is supported and either enable it if it is, or skip the testcase. This is the same for setup code running at the group or suite level.

There are helper functions in rabbitmq-ct-helpers to ease that check. Here is an example, taken from the dynamic_qq_SUITE.erl testsuite in rabbitmq-server:

init_per_testcase(Testcase, Config) ->
    % (...)

    % 1.
    % The broker or cluster is started: we rely on this to query feature
    % flags.
    Config1 = rabbit_ct_helpers:run_steps(
                Config,
                rabbit_ct_broker_helpers:setup_steps() ++
                rabbit_ct_client_helpers:setup_steps()),

    % 2.
    % We try to enable the `quorum_queue` feature flag. The helper is
    % responsible for checking if the feature flag is supported and
    % enabling it.
    case rabbit_ct_broker_helpers:enable_feature_flag(Config1, quorum_queue) of
        ok ->
            % The feature flag is enabled at this point. The setup can
            % continue to play with `Config1` and the cluster.
            Config1;
        Skip ->
            % The feature flag is unavailable/unsupported. The setup
            % calls `end_per_testcase()` to stop the node/cluster and
            % skips the testcase.
            end_per_testcase(Testcase, Config1),
            Skip
    end.

It is possible to run testsuites locally in the context of a mixed-version cluster. If configured to do so, rabbitmq-ct-helpers will use a second version of RabbitMQ to start half of the nodes when starting a cluster:

  • Node 1 will be on the primary copy (the one used to start the testsuite)
  • Node 2 will be on the secondary copy (the one provided explicitly to rabbitmq-ct-helpers)
  • Node 3 will be on the primary copy
  • Node 4 will be on the secondary copy
  • ...

To run a testsuite in the context of a mixed-version cluster:

  1. Clone the rabbitmq-public-umbrella repository and checkout the appropriate branch or tag. This will be the secondary Umbrella. In this example, the v3.7.x branch is used:

    git clone https://github.com/rabbitmq/rabbitmq-public-umbrella.git secondary-umbrella
    cd secondary-umbrella
    git checkout v3.7.x
    make co
    

    Currently, when using the v3.7.x branch, deps/rabbit_common and deps/rabbit must use the v3.7.x-versions-compatibility branch.

  2. Compile RabbitMQ or the plugin being tested in the secondary Umbrella. The rabbitmq-federation plugin is used as an example:

    cd secondary-umbrella/deps/rabbitmq_federation
    make dist
    

  3. Go to RabbitMQ or the same plugin in the primary copy:

    cd /path/to/primary/rabbitmq_federation
    

  4. Run the testsuite. Here, two environment variables are specified to configure the "mixed-version cluster" mode:

    SECONDARY_UMBRELLA=/path/to/secondary-umbrella \
    RABBITMQ_FEATURE_FLAGS= \
    make tests
    

    The first environment variable, SECONDARY_UMBRELLA, tells rabbitmq-ct-helpers where to find the secondary Umbrella, as the name suggests. This is how the mixed-version cluster mode is enabled.

    The second environment variable, RABBITMQ_FEATURE_FLAGS, is set to the empty string and tells RabbitMQ to start with all feature flags disabled: this is mandatory to have a newer node compatible with an older one.
