Exchange to Exchange bindings

RabbitMQ 2.1.1 adds support for bindings between exchanges. This is an extension of the AMQP specification, and making use of this feature will (currently) result in your application only functioning with RabbitMQ, and not with the myriad other AMQP 0-9-1 broker implementations out there. However, this extension greatly increases the expressivity and flexibility of routing topologies, and solves some scalability issues at the same time.

Normal bindings allow exchanges to be bound to queues: messages published to an exchange will, provided the various criteria of the exchange and its bindings are met, pass through the bindings and be appended to the queue at the end of each binding. That's fine for a lot of use cases, but there's very little flexibility there: it's always just one hop. The message is published to one exchange, with one set of bindings, and consequently one possible set of destinations. If you need something more flexible, you have to resort to publishing the same message multiple times. With exchange-to-exchange bindings, a message published once can flow through any number of exchanges of different types, allowing vastly more sophisticated routing topologies than were previously possible.

Example: Logging

Imagine a generic logging scenario: you want to "tap" into various parts of your message flows within RabbitMQ to check on the stream of messages that is flowing through that particular exchange. You can't do this to a queue, so the most obvious solution is to add a fresh queue which is going to be your logging queue, and to bind it to the exchange you're interested in. Now depending on the type of exchange and your binding key, you may receive none, some or all of the messages going through that exchange. This could be represented by the following diagram:

But what if you have multiple logging queues -- you might have one for syslog, one for the console, one for some third-party management software? It would be much simpler to be able to treat all of these as a single entity, requiring the addition of just one binding, as above, to wire all of them to the same source exchange. With exchange-to-exchange bindings, you can now do this:

Now, we have our existing logging exchange with a couple of queues receiving all messages from it, and we just need to add one new binding (the one in RabbitMQ-orange) between the exchange we're interested in, and our logging exchange. Whilst both exchanges here are fanout, there's no need for this to be the case: we might have different logging queues which are interested in only subsets of the messages that are flowing through the logging exchange. So that exchange could well be a topic exchange:

So now syslog is only going to receive errors (i.e. messages with a routing key prefixed by "error."), whilst the console receives all messages. In both cases, this behaviour applies regardless of the source of the messages: the logging exchange can be bound to zero, one, or many exchanges as necessary.
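The logging topology described above can be sketched as a toy model in plain Python. This is only an illustration of how a message hops from one exchange to the next, not RabbitMQ's implementation: the names orders, orders_queue, EXCHANGES, BINDINGS, topic_matches and route are all hypothetical, and the sketch assumes the topology has no cycles.

```python
def topic_matches(pattern, key):
    """Topic-style matching: '*' is exactly one word, '#' is zero or more words."""
    p, k = pattern.split("."), key.split(".")

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# Hypothetical topology: exchange name -> exchange type
EXCHANGES = {"orders": "fanout", "logging": "topic"}

# (source exchange, binding key, destination name, destination kind)
BINDINGS = [
    ("orders", "", "orders_queue", "queue"),
    ("orders", "", "logging", "exchange"),   # the single exchange-to-exchange tap
    ("logging", "error.#", "syslog", "queue"),
    ("logging", "#", "console", "queue"),
]


def route(exchange, routing_key):
    """Return the set of queues a message published to `exchange` ends up in."""
    queues = set()
    for source, key, dest, kind in BINDINGS:
        if source != exchange:
            continue
        # A fanout exchange ignores the key; a topic exchange matches it.
        if EXCHANGES[exchange] == "fanout" or topic_matches(key, routing_key):
            if kind == "queue":
                queues.add(dest)
            else:
                queues |= route(dest, routing_key)   # follow the hop
    return queues


errors = route("orders", "error.disk")
infos = route("orders", "info.start")
```

In this sketch a message with key "error.disk" reaches orders_queue, syslog and console, while "info.start" reaches only orders_queue and console. Tapping a second exchange would take one more entry in BINDINGS pointing at logging.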

Usage

The existing queue.bind AMQP method suggests, by its naming, that the action you're performing is to bind a queue to an exchange. This is slightly confusing because the messages are actually flowing from the exchange, through the binding, and to the queue. However, the easy part is that the method has a field for the queue name, the exchange name, and the binding key.

We have introduced exchange.bind and exchange.unbind AMQP methods. Sadly, because both of the end-points of such bindings are exchanges, and we can't have two fields both called exchange, we've had to come up with a different naming scheme. We've chosen here to reflect the flow of messages. Thus the field source indicates the name of the exchange from which messages enter the binding, and the field destination indicates the name of the exchange to which messages are passed.

We have added support for exchange-to-exchange bindings in our Java, .NET and Erlang clients. We hope other community-contributed clients will add support soon.

The binding created by exchange.bind is semantically identical to queue.bind bindings: unidirectional, binding keys and exchange types operate as normal, but both endpoints (the source and destination) of the binding are exchanges.
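As a rough sketch of how the source and destination fields surface in client code, the helpers below assume a channel object shaped like the one in the community Python client pika, whose channels offer exchange_declare, exchange_bind and exchange_unbind. pika is not one of the clients named above, so treat its use here as an assumption. The helper names, the exchange names and the RecordingChannel stand-in (which lets the sketch run without a broker) are hypothetical.

```python
def add_logging_tap(channel, source, logging_exchange="logging"):
    """Bind `logging_exchange` to `source` so it sees what `source` routes.

    `channel` is assumed to behave like a pika channel; nothing here
    depends on pika itself being installed.
    """
    channel.exchange_declare(exchange=logging_exchange, exchange_type="fanout")
    # Messages enter the binding from `source` and are passed to `destination`.
    channel.exchange_bind(destination=logging_exchange, source=source,
                          routing_key="")


def remove_logging_tap(channel, source, logging_exchange="logging"):
    """Remove the tap again; the same fields identify the binding."""
    channel.exchange_unbind(destination=logging_exchange, source=source,
                            routing_key="")


class RecordingChannel:
    """Stand-in for a real channel: records calls instead of talking to a broker."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, method):
        # Only reached for attributes that do not exist, i.e. the AMQP methods.
        def record(**kwargs):
            self.calls.append((method, kwargs))
        return record


channel = RecordingChannel()
add_logging_tap(channel, source="orders")
remove_logging_tap(channel, source="orders")
```

With a real connection, the same two helpers would be handed the channel obtained from the client library instead of the recording stand-in.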

Just like with queue.bind, multiple distinct bindings can be created between the same binding endpoints. We detect and eliminate cycles during message delivery, and ensure that, transitively, over any routing topology, every queue to which a given message is routed receives exactly one copy of that message.

Exchanges which are declared as auto-delete will still be removed when all the bindings for which that exchange is the source are removed, regardless of whether the destinations of those bindings are queues or exchanges. Note that only bindings for which the exchange is the source count here: if you add exchange-to-exchange bindings for which the given exchange is the destination, then that exchange will not be auto-deleted on removal of those bindings. This mirrors the fact that an auto-delete queue is not deleted when bindings to that queue are removed.
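One way to picture the cycle elimination and the exactly-one-copy guarantee is a graph traversal that visits each exchange at most once and collects destination queues in a set. The sketch below is a toy model under that assumption, not a description of the broker's actual algorithm. Every exchange behaves like a fanout, and all names are hypothetical.

```python
from collections import deque

# Toy topology containing cycles (a -> b -> a, and a -> b -> c -> a).
EXCHANGE_BINDINGS = {"a": ["b"], "b": ["c", "a"], "c": ["a"]}    # source -> exchanges
QUEUE_BINDINGS = {"a": ["q1"], "b": ["q1", "q2"], "c": ["q2"]}   # source -> queues


def route(start):
    """Return the queues reached from `start`, visiting each exchange once."""
    seen = {start}
    queues = set()            # a set, so a queue reached twice still gets one copy
    pending = deque([start])
    while pending:
        exchange = pending.popleft()
        queues.update(QUEUE_BINDINGS.get(exchange, []))
        for dest in EXCHANGE_BINDINGS.get(exchange, []):
            if dest not in seen:      # skip exchanges already visited (cycle or diamond)
                seen.add(dest)
                pending.append(dest)
    return queues


delivered = sorted(route("a"))
```

Here q1 is bound to both a and b, and q2 to both b and c, yet each appears once in the result, and the traversal terminates despite the cycles.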

Example 2: Presence

Imagine a chat system. Every user is going to have a queue, which holds all of the messages sent to that user. That queue also should be sent presence notifications: the events that indicate whether the person's friend is coming online or going offline.

Our imaginary person is called John. When John comes online, he's going to publish a message to the exchange presence saying that he's online and available for chat. The presence exchange will thus be a direct exchange, and John will publish his presence to that exchange with a routing key of "John". Thus all of John's friends need to be subscribed to the presence exchange (i.e. they need to have a binding to their own queue from that exchange), with a binding key of "John". When logging in, John himself needs to bind his queue to the presence exchange, with one binding per friend: each binding carrying a different binding key (e.g. a binding with key of Alice, Bill etc). The overall system (just for presence) might look a bit like this:

Here we see John is friends with Alice and Bill (thus he binds his queue to the presence exchange with binding keys of Alice and Bill). Alice and Bill are not friends with each other, but each of them has several other friends, John among them.

Thus when each person comes online, they must create their queue, and they must create bindings to that queue for each of their friends. In a large chat system, the average number of friends might be about 20, and there may be hundreds if not thousands of people coming online or going offline every minute. At this point, the churn rate in the bindings may become a severe performance bottleneck. With exchange-to-exchange bindings, this problem can be solved. The trick is to allow the friendship relations to be expressed solely by exchange-to-exchange bindings, which can be left in place even when users go offline. When the user comes online, they then need only to create their queue and a single binding:

As usual, messages that are routed to exchanges with no bindings just vanish, so there is no buffering going on if John is offline: John_queue doesn't exist, and the exchange John discards all the messages sent to it. As a result, the exchange-to-exchange binding mesh only needs modifying when people add or remove friends, and the load induced by friends coming online or going offline is vastly reduced.
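The presence scheme can be sketched as a small in-memory model. It is only an illustration, under the assumption that each user has a per-user exchange named after them. The friendship mesh is a long-lived set of exchange-to-exchange bindings on the presence exchange, and logging in adds just one queue and one binding. The names FRIENDS, mesh, online, login, logout and publish_presence are hypothetical.

```python
FRIENDS = {"John": ["Alice", "Bill"], "Alice": ["John"], "Bill": ["John"]}

# Long-lived mesh on the direct `presence` exchange:
# (binding key = friend being watched, destination = watcher's per-user exchange).
# It changes only when friendships change, not when users log in or out.
mesh = {(friend, user) for user, friends in FRIENDS.items() for friend in friends}

# Per-user exchange -> that user's queue; an entry exists only while online.
online = {}


def login(user):
    online[user] = []      # declare the queue plus ONE binding from the user's exchange


def logout(user):
    del online[user]       # queue and its single binding go away; the mesh stays


def publish_presence(user, status):
    """Publish to `presence` with routing key `user`."""
    for key, dest in mesh:
        # A per-user exchange with no queue bound simply drops the message.
        if key == user and dest in online:
            online[dest].append(f"{user} is {status}")


login("Alice")
login("John")
publish_presence("John", "online")   # Alice is notified; Bill is offline, so his copy vanishes
login("Bill")
publish_presence("Bill", "online")   # only John watches Bill
```

Each login or logout touches a single entry in this model, whereas the original design would add or remove one binding per friend.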

And this is merely the start: far more complex routing topologies are now possible with exchange-to-exchange bindings...

22 Responses to “Exchange to Exchange bindings”

  2. Ben Pirt Says:

    Fantastic - I've been wanting this for a long time!

    Nice work guys. Any chance it will make it into the AMQP spec?

  3. Matthew Sackman Says:

    @Ben
    Glad you like it. Apparently it's fairly easy to get things into the spec: we just need two AMQP broker implementations which implement the same feature, and then it becomes possible for it to be added to the spec - maybe in the form of a 0-9-2 AMQP spec or something like that.

    Given its utility, it's not hard to imagine other implementations adding this. It would be great if they did so in a compatible way.

  4. Michael Says:

    @Matthew @Ben AMQP's specification process doesn't quite work like that, sadly. But the good thing about extensions of an open protocol is that any client, and any broker, can freely implement them. So we might hope to see it appear elsewhere.

  5. Matthew Sackman Says:

    @Michael: apologies - I'd clearly misinterpreted something I'd heard.

  6. Tony Garnock-Jones Says:

    When there's an E2E binding, say routing messages from exchange A into exchange B, and I receive a message from queue Q that's bound to exchange B, what will be the value of the Basic.Deliver's "exchange" field?

  7. Matthew Sackman Says:

    @Tony: it just keeps the name of the exchange to which the message was published. It is one message. The message includes the properties of publication. These properties can not change if the message goes through multiple exchanges -- otherwise that would make it into multiple messages.

  8. Jeff Laughlin Says:

    I think AMQP 1.0 permits this sort of thing as an upshot of replacing exchanges and queues with nodes and links. Somebody please correct me if I am wrong.

  9. Michael Says:

    @Jeff

    AMQP 1.0 does not replace exchanges and queues with nodes and links, though it is tempting to think so from the treatment given in previous revisions.

    AMQP 1.0 deals only with transporting messages from A to B, and not with what happens /at/ A or B.

  10. Michael Says:

    @Matthew @Tony "exchange" is a field in basic.publish and basic.deliver, not part of the message. So there's no sense in which it can be involved in a mutation of the message.

  11. Matthew Sackman Says:

    @Michael, I don't agree with that. AFAIC, the fields in basic.publish and basic.deliver, especially where they intersect, are indeed part of the message.

  12. Michael Says:

    @Matthew Interesting position; so delivery-tag and consumer-tag are part of the message too -- that would mean messages are duplicated when delivered to more than one consumer, wouldn't it?

    (If so, why the problem with "changing" the exchange field, again ..?)

  13. Matthew Sackman Says:

    @Michael: The issue, for me, is about loss of information. Really, I think the message structure is inadequate - I'd like for it to contain the entire path the message takes. However, that can rapidly get complex e.g. for security reasons: it might be important that a consumer should not be able to figure out where a message came from. The fact that the consumer is always told the delivery-tag and consumer-tag is justifiable on the grounds that without those, the consumer can't process the message correctly.

    In the absence of the message being able to, on delivery, contain its entire delivery path, prioritising information in the message which was provided by the publishing client (i.e. the exchange to which the message was published) over (partial) information which is an artifact of the routing topology of the broker (i.e. the last of (possibly) many exchanges through which the message passed), seems sensible.

    There are some properties that we already have on messages which are per-queue properties (i.e. they can vary within the same message, should the message end up in multiple queues). In some ways it's the same message. In other ways, they're copies of the same message...

  14. Tony Garnock-Jones Says:

    @Matthew: A good way to get a handle on this stuff is to think about properties of the envelope (in the AMQP case, basic.deliver) as distinct from properties of the message. As always, SMTP has been here before and has a reasonable first stab at the problem which is worth having a read of. Some of the specific decisions it makes are specific to the email context, and so making the same decisions in an AMQP context would be a mistake, but many of them are sensible. And similarly, in AMQP some of the fields in the basic.publish/basic.deliver really are part of the message rather than the envelope, so it can be difficult to see where the split should go. (And in some cases there are no good choices.)

    For example, in SMTP there is an envelope-from and envelope-to that are distinct from the from and to in the message header. The "received-by" header is a hop-by-hop header that is a part of the envelope, not the message. "subject", "body" and "date" are parts of the message, not the envelope. Etc.

    Ultimately the AMQP 0-8/0-9 model is pretty broken once you start extending it to things like E2E binding, so warts like the treatment of "exchange" in a basic.deliver resulting from an E2E binding will start to appear more and more frequently. TBH I don't know if AMQP 1.0-draft is any better.

  15. Matthew Sackman Says:

    @Tony: We certainly agree that AMQP is pretty broken wrt various headers once you start adding extension points such as E2E. I think we're doing the least bad option, but sure, it's not ideal.

    SMTP is, IMO, probably just as bad a mess. You seem to not make a distinction between SMTP (2821 or later) and Internet Mail Format (5322). In theory, there should be no reason for the MTA to inspect the payload of the message (i.e. anything after DATA in the SMTP session). In theory, the payload should be able to be random opaque data: the SMTP session should convey everything that is necessary for the relaying of the message (and, content filtering aside, it does). The issues start to arise when the MTA wants to add its own Received header (amongst others) to the payload, at which point, the strict separation between the SMTP session and the payload gets rather blurred. Thus really, the only way in which the payload can be entirely opaque is if you start using MIME, at which point, the differences between the SMTP session and the headers of the email are weird and bizarre. In particular, the fact some elements of the SMTP session are repeated in the email is just strange.

    The numerous ways in which MUAs cock this up further do not help matters!

  16. Tony Garnock-Jones Says:

    @Matthew: Exactly, I meant to include 2822 (5322 now I guess, thanks for the pointer). While the fields are syntactically intertwined, there's still a clear ownership. One of the few valuable things to come out of the AMQP 0-10 work was the slogan "don't confuse ownership with position", i.e. a field ("received-by", for instance) can be logically owned by one layer but physically placed anywhere you choose in a message on the wire. In the case of SMTP, some of the envelope/transport fields are stuck in the headers of the message, which is pretty awful, but as the revisions of the standards go by it seems like the logical ownership story is starting to get clarified.

    Regarding your point that "some elements of the SMTP session are repeated in the email", if you're talking about the "received-by" headers then I agree, but if you're talking about the addressing information (To:, From:, MAIL From:, RCPT To:), then there are clear semantic differences analogous to the differences between the headers section of a paper letter and the name and address on the outside of an envelope. The most obvious of which is that the information on the outside of the envelope is intended for the transport agent to use in routing, where the information on the paper inside the envelope is intended for the recipient to read and use in whatever they are doing.

  17. Tony Garnock-Jones Says:

    (Incidentally, the embedded SVG+XML looks great on a desktop browser but doesn't render on my Android phone. If there were a PNG fallback of some kind that'd be useful.)

  18. Matthew Sackman Says:

    @Tony: yay - you're the first one to notice they're SVGs. The webkit in iPhone does cope with SVGs; sadly, Palm and Google strip out SVG support. There are various noises about making a tiny implementation as part of a Google SoC project.

  19. Stephen Graham Says:

    Are E2E bindings limited to Exchanges that are on the same Vhost? Same RabbitMQ server? Or can I do E2E exchanges across servers to implement some sort of wild and crazy distributed Disaster Recovery kind of scenario?

  20. Matthew Sackman Says:

    @Stephen

    Are E2E bindings limited to Exchanges that are on the same Vhost?

    Yes. VHosts are an inviolable namespace.

    Same RabbitMQ server?

    Logical server, yes (i.e. it's fine within a cluster).

    Or can I do E2E exchanges across servers to implement some sort of wild and crazy distributed Disaster Recovery kind of scenario?

    No - clustering is not a good idea over WAN links.

  21. Glenn Babecki Says:

    What are the constraints on the types of exchanges that can be bound together? It would seem that other than using a fanout exchange in the mix, all "routable" exchanges must be of the same type in order to process the routing/binding key supplied by the publisher originating the message. Is this a correct interpretation?

  22. Matthew Sackman Says:

    @Glenn

    There are no constraints on the types of exchanges that can be bound together at all. It is simply the case that each exchange will process the routing key of a message against its outbound bindings in the normal way.