Archive for September, 2010

Broker vs Brokerless

Wednesday, September 22nd, 2010

The RabbitMQ team has been working with Martin Sustrik to provide code and documentation for using RabbitMQ and ZeroMQ together.  Why is this a good idea?  Because the broker and brokerless approaches are complementary.  We'll be posting more about this as the codebase evolves.  This post is introductory and can be seen as commentary on Ilya Grigorik's excellent introduction to ZeroMQ and the InfoQ summary of Ilya's article.

I like ZeroMQ and think it is useful - of which more below.  But I have seen some brash claims made on its behalf.  This can lead to confusion.

So what is the 'brokerless' model?  In the comments to Ilya's and the InfoQ post, ZeroMQ is compared to SCTP and to JGroups.  These are important technologies and form a helpful starting point for thinking about brokerless messaging patterns.  Let's look at what you might need if you combine messaging (like SCTP) with pubsub groups (like JGroups) to make arbitrary networks using 'brokerless' peers.

Some things you might need in a brokerless network

If you set up a brokerless messaging network, three things that you might need are: discovery, availability and management.

Discovery is the problem of maintaining a roster of peers that a system can send messages to, and who can join this roster.

Availability is the problem of dealing with peers disappearing from time to time.  For example if you have 50 subscribers to a feed, and only 40 of them are available to receive updates, should you keep a copy of their messages until they reappear?  That could mean "for a very long time".   And if you do keep messages and lists of "who has seen what", then where is it best to do this?

This is also a problem when message receivers do not respond quickly.  To quote from Martin Sustrik of ZeroMQ, "You can never differentiate between 'network error' and 'no response received'. TCP is no better. You'll have to accept that or keep with a single box."

Management is an interesting area for analysis too.  ZeroMQ's model aligns messaging closely with sockets.  This means that, as with TCP, 'any' communication network can be implemented in such a way that it provides some messaging capability.  But networks can be arbitrarily complex.  For example, unless you don't care about it (and you may not), managing "who is connected to whom, and who can be connected to whom" can get complicated.  This kind of management problem gets more difficult the more you scale.  Models like JGroups usually make this problem go away by making a simplifying assumption, i.e. everyone in the group talks to everyone else in the group.  Easy :-)

I am not suggesting that you always need these things.  The ZeroMQ philosophy is to home right in on networking, and this creates focus.  But if you do need them then you might end up implementing them yourself.  Enter the broker...

How can a broker help to solve these problems?

Brokers can provide solutions for discovery, availability and management.  They can also form reliable networks, e.g. for email delivery and instant messaging services.

First: what is a 'broker'?  It is both a leader, and an intermediary.

A broker is a leader.  In distributed computing, the problems of management, discovery and availability are typically solved by electing a leader among the set of distributed components.  In the world of "messaging", such a leader is usually known as a "broker".  Stating that you need to be a broker in order to be a leader makes it much easier to work out who the leader is than it would be in a completely brokerless system in which "anyone can lead, but nobody knows how".

A broker is also an intermediary.  For example, instead of having to connect everyone in the group directly, communicators simply connect to the broker (or brokers).  A broker may also be used to solve availability problems such as "offline consumer", by providing persistence and managing recovery on behalf of systems that cannot do it themselves.

Thus, brokers simplify network design by making reasonable assumptions.  Of course, when those assumptions don't hold, you may not want a broker.

Brokers are not 'centralized'

A commonly held misconception about brokers is that they are 'centralized'.  Brokers are NOT necessarily a 'centralized' solution.  Intermediaries can be decentralized.  You can have multiple brokers in a single network in order to increase throughput and availability.  Sometimes these networks of servers are called federations.  Note that individual brokers do not need to be 'highly available' in order to have a redundant network of servers.

This is, for example, how email (SMTP) and XMPP networks work.  Both email and instant messaging are brokered models, and both use multiple brokers in a simple and redundant way.  For example, mail transfer agents provide a delivery and routing network for email.  It would be difficult to come up with a design for this that was completely peer to peer, without reinventing 'special peers' - also known as brokers.

So what model is simplest?

Peer to peer models are not inherently more or less simple than brokered models.  If you do not need discovery, availability, management, or intermediation then it may be simpler to not use them.  But if you need them, it may be simpler to not implement them yourself.

Networks of servers (brokers) are not more or less redundant or decentralized than networks of clients (peers).  Both the broker and brokerless models have their pros and cons in terms of reliability and other considerations, e.g. latency.

The two models solve different problems.

For example, RabbitMQ and ZeroMQ are complementary.  From a RabbitMQ point of view ZeroMQ is a 'smart client' that can use its buffers like a queue.  That's useful in some cases.  From a ZeroMQ point of view, RabbitMQ is a network device that provides services that you would not necessarily want to have to implement yourself.

We want our customers and users to always have the best toolset available which is why we have provided the Github repo for you to play with.  Thanks again to Martin Sustrik for his work on this.

Watch this space for more on this interesting area of work and discussion.

RabbitMQ on github

Monday, September 20th, 2010

We've received quite a few requests recently for us to put the RabbitMQ code on github.

RabbitMQ is open source, and the Mercurial repositories where we work on the code are publicly accessible. But github is rapidly establishing itself as the Facebook of open-source development: It makes it easy to follow projects and participate in their development, all within a slick web-based UI.

So from today, we are mirroring our repositories to github. You can find them at http://github.com/rabbitmq. The repositories on github track our Mercurial repositories with a delay of a few minutes.

The main development of RabbitMQ will continue to take place on Mercurial. Converting our development workflow and infrastructure to git would take a lot of effort that we'd prefer to spend improving RabbitMQ. And besides, members of the team differ in their opinions about the relative merits of hg and git.

If you wish to contribute to RabbitMQ, we are happy to receive changes via github, or Mercurial hosting sites such as bitbucket, or even as old-fashioned patches!

Very fast and scalable topic routing – part 1

Tuesday, September 14th, 2010

Among other things, lately we have been preoccupied with improving RabbitMQ's routing performance. In particular we have looked into speeding up topic exchanges by using a few well-known algorithms as well as some other tricks. We were able to reach solutions many times faster than our current implementation.

First, a little about the problem we are trying to solve. Here is a quote from the AMQP 0-9-1 spec:

The topic exchange type works as follows: The routing key used for a topic exchange MUST consist of zero or more words delimited by dots. Each word may contain the letters A-Z and a-z and digits 0-9. The routing pattern follows the same rules as the routing key with the addition that * matches a single word, and # matches zero or more words. Thus the routing pattern *.stock.# matches the routing keys usd.stock and eur.stock.db but not stock.nasdaq.

Our goal is to match messages (routing keys) against bindings (patterns) in a fast and scalable manner.
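To make the matching rules concrete, here is a minimal Python sketch of matching a single pattern against a single routing key. It is only an illustration of the rules quoted above, not the code used in RabbitMQ (which is written in Erlang); the function names `matches` and `_match` are made up for this example, and treating an empty string as "zero words" is an assumption.

```python
def matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic pattern matches a routing key.

    "*" matches exactly one word, "#" matches zero or more words.
    Assumption: an empty string stands for "zero words".
    """
    pattern_words = pattern.split(".") if pattern else []
    key_words = routing_key.split(".") if routing_key else []
    return _match(pattern_words, 0, key_words, 0)


def _match(p, i, k, j):
    # p[i:] is the part of the pattern still to match, k[j:] the rest of the key.
    if i == len(p):
        return j == len(k)
    if p[i] == "#":
        # "#" may consume zero or more words: try every possible tail of the key.
        return any(_match(p, i + 1, k, t) for t in range(j, len(k) + 1))
    if j == len(k):
        return False
    if p[i] == "*" or p[i] == k[j]:
        return _match(p, i + 1, k, j + 1)
    return False


# The examples from the spec quote above.
print(matches("*.stock.#", "usd.stock"))      # True
print(matches("*.stock.#", "eur.stock.db"))   # True
print(matches("*.stock.#", "stock.nasdaq"))   # False
```

Running this check once per binding is the naive baseline: its cost grows with the number of bindings, which is the scalability problem the approaches below try to address.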

Here is a list of approaches that we tried out:

  1. Caching messages' topics on a per-word basis. This is what the AMQP spec suggests and there are some studies on this already.
  2. Indexing patterns on a per-word basis. This is similar to 1, except that we prepare the patterns beforehand, rather than preparing for topic keys that have been previously sent.
  3. Trie implementation. Arrange the words in the patterns in a trie structure and follow a route down the trie to see if a particular topic matches.
  4. A deterministic finite automaton (DFA) implementation. This is a well-known approach for string matching in general.

Each of these approaches has pros and cons. We generally aimed for:

  • good complexity in both space and time, to make it scalable
  • ease of implementation
  • good performance for the commonly used situations
  • good worst-case performance
  • making it quick in the simple cases (where scalability in number of bindings is not a concern)

From the start, we were able to beat the current implementation by a factor of 3 (in all cases) just by being more careful when splitting the keys into words (rather than re-splitting both the pattern and the topic for each pattern, every time).

We found approaches 1 and 2 to be particularly unfit for our needs. They were the slowest, they do not have good complexity because they involve intersecting sets for each level, and they cannot be adapted to include functionality for "#". Thus, we concentrated our attention on approaches 3 and 4.

The trie

Here is an example of a trie structure, if we were to add patterns "a.b.c", "a.*.b.c", "a.#.c", "b.b.c":

In order to match a topic (say for example "a.d.d.d.c"), we start at the root and follow the topic string down the tree word by word. We can go deeper either through an exact match, a "*" or a "#". In the case of the "#" we can go deeper with all the versions of the tail of the topic. For our example, we would go through "#" with "d.d.d.c", "d.d.c", "d.c", "c" and "".

The trie implementation has a number of advantages: it has good space complexity, adding a new binding is cheap, and it is the easiest to implement. Its disadvantage is that it has to backtrack for "*" and "#" in order to find all possible matches.
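The walk described above can be sketched in a few lines of Python. This is an illustrative toy, not the RabbitMQ implementation; the class and method names (`TrieNode`, `TopicTrie`, `add`, `match`) are invented for the example, and each node simply remembers the patterns that end there.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # word, "*" or "#" -> TrieNode
        self.bindings = []   # patterns that end at this node


class TopicTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, pattern):
        # Adding a binding only touches one path of the trie, so it is cheap.
        node = self.root
        for word in (pattern.split(".") if pattern else []):
            node = node.children.setdefault(word, TrieNode())
        node.bindings.append(pattern)

    def match(self, topic):
        words = topic.split(".") if topic else []
        found = set()
        self._walk(self.root, words, 0, found)
        return found

    def _walk(self, node, words, i, found):
        if i == len(words):
            # The whole topic has been consumed: patterns ending here match.
            found.update(node.bindings)
        else:
            # Go deeper through an exact match or through "*".
            for key in (words[i], "*"):
                child = node.children.get(key)
                if child is not None:
                    self._walk(child, words, i + 1, found)
        # Go deeper through "#" with every tail of the topic, including the
        # empty one. This is where the backtracking happens.
        hash_child = node.children.get("#")
        if hash_child is not None:
            for t in range(i, len(words) + 1):
                self._walk(hash_child, words, t, found)


trie = TopicTrie()
for pattern in ["a.b.c", "a.*.b.c", "a.#.c", "b.b.c"]:
    trie.add(pattern)

print(trie.match("a.d.d.d.c"))   # {'a.#.c'}
```

For the topic "a.d.d.d.c" the walk tries the "#" branch with "d.d.d.c", "d.d.c", "d.c", "c" and "", and only the tail "c" leads to a node where a pattern ends.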

The DFA

This approach is based on constructing an NFA that accepts the patterns of the bindings, and from it constructing the equivalent DFA and using it instead. Since we are also interested in which pattern matches and not only if it matches or not, we cannot merge the tails of the patterns in the NFA.

To construct the DFA, we modeled the behaviour of "#" like this:

For example, the patterns "a.b.c", "a.*.b.c", "a.#.c", "b.b.c" would be represented in an NFA like this:

The nodes 11, 4, 6 and 8 would have information attached to them which would point to the respective bindings.

In order to convert the NFA to a DFA, we tried various approaches and went as far as generating source code for the structures behind the graphs, to make it as fast as possible. The best solution we ended up with was building the DFA on the fly, the same way it is built in good regular expression compilers (see for example this article).

The advantage of the DFA approach is that there is no need to backtrack, once the DFA has been built. On the other hand, there are quite a number of disadvantages: it occupies significantly more memory than the trie; there is a significant cost for adding new bindings, since the entire DFA has to be dropped and rebuilt; and it is more complex and therefore harder to implement and maintain.
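As a rough illustration of building the DFA on the fly, the Python sketch below simulates the NFA with sets of states and caches each set-to-set transition the first time it is needed (lazy subset construction). Several things here are assumptions made for the example rather than details from our implementation: the names (`LazyTopicDFA`, `add`, `match`), the modelling of "#" as a state that loops on any word and can also be skipped, one separate chain of states per pattern with no prefix sharing, and the single `OTHER` symbol standing in for every word that appears in no pattern.

```python
class LazyTopicDFA:
    OTHER = object()   # stands for any word that appears in no pattern

    def __init__(self):
        self.patterns = []   # each pattern as a list of words
        self.vocab = set()   # literal words used by the patterns
        self.cache = {}      # (dfa_state, symbol) -> dfa_state

    def add(self, pattern):
        words = pattern.split(".") if pattern else []
        self.patterns.append(words)
        self.vocab.update(w for w in words if w not in ("*", "#"))
        # A new binding invalidates everything built so far.
        self.cache.clear()

    def _closure(self, states):
        # An NFA state is (pattern index, position in that pattern).
        # "#" may match zero words, so the position after it is reachable too.
        closed = set(states)
        stack = list(closed)
        while stack:
            p, pos = stack.pop()
            words = self.patterns[p]
            nxt = (p, pos + 1)
            if pos < len(words) and words[pos] == "#" and nxt not in closed:
                closed.add(nxt)
                stack.append(nxt)
        return frozenset(closed)

    def _step(self, state, symbol):
        key = (state, symbol)
        if key not in self.cache:
            nxt = set()
            for p, pos in state:
                words = self.patterns[p]
                if pos == len(words):
                    continue                 # pattern exhausted, word left over
                if words[pos] == "#":
                    nxt.add((p, pos))        # "#" consumes the word and stays
                elif words[pos] == "*" or words[pos] == symbol:
                    nxt.add((p, pos + 1))
            self.cache[key] = self._closure(nxt)
        return self.cache[key]

    def match(self, topic):
        # The start state is recomputed per call to keep the sketch short.
        state = self._closure((p, 0) for p in range(len(self.patterns)))
        for word in (topic.split(".") if topic else []):
            symbol = word if word in self.vocab else self.OTHER
            state = self._step(state, symbol)   # one lookup per word once cached
            if not state:
                return set()
        return {".".join(self.patterns[p])
                for p, pos in state if pos == len(self.patterns[p])}


dfa = LazyTopicDFA()
for pattern in ["a.b.c", "a.*.b.c", "a.#.c", "b.b.c"]:
    dfa.add(pattern)

print(dfa.match("a.d.d.d.c"))   # {'a.#.c'}
```

Once a transition is cached, matching a word is a single dictionary lookup with no backtracking. The call to `self.cache.clear()` in `add` shows the cost mentioned above: every new binding throws the automaton away.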

In the following articles we will present more details about the two structures, how they performed in benchmarks, their space and time complexities and the details behind the DFA optimizations that we have tried.

To be continued.

Management plugin – preview release

Tuesday, September 7th, 2010

The previously mentioned management plugin is now in a state where it's worth looking at and testing. In order to make this easy, I've made a special once-only binary release just for the management plugin (in future we'll make binary releases of it just like the other plugins). Download all the .ez files from here and install them as described here, then let us know what you think. (Update 2010-09-22: Note that the plugins referenced in this blog post are for version 2.0.0 of RabbitMQ. We've now released 2.1.0 - for this and subsequent versions you can get the management plugin from here).

After installation, point your browser at http://server-name:55672/mgmt/. You will need to authenticate as a RabbitMQ user (on a fresh installation the user "guest" is created with password "guest"). From here you can manage exchanges, queues, bindings, virtual hosts, users and permissions. Hopefully the UI is fairly self-explanatory.

The management UI is implemented as a single static HTML page which makes background queries to the HTTP API. As such it makes heavy use of Javascript. It has been tested with recent versions of Firefox, Chromium and Safari, and with versions of Microsoft Internet Explorer back to 6.0. Lynx users should use the HTTP API directly :)

The management plugin will create an HTTP-based API at http://server-name:55672/api/. Browse to that location for more information on the API. For convenience the documentation can also be obtained from our Mercurial server.
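If you would rather script against the API than browse it, a request can be prepared along the following lines. This is only a sketch: it assumes the API accepts HTTP Basic authentication with RabbitMQ credentials, it targets just the base URL given above (no specific endpoints are assumed), and the helper names `build_api_request` and `fetch` are made up for this example.

```python
import base64
import urllib.request


def build_api_request(host, user="guest", password="guest", port=55672):
    """Prepare a GET request for the management API's base URL.

    Assumption: the API uses HTTP Basic authentication.
    """
    url = f"http://{host}:{port}/api/"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch(request, timeout=5):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# "server-name" is the placeholder host from the text; replace it with yours.
request = build_api_request("server-name")
print(request.full_url)
```

Calling `fetch(request)` against a running server should return whatever the base URL serves, which per the text above is the information about the API itself.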

WARNING: The management plugin is still at an early stage of development. You should be aware of the following limitations:

  • Permissions are only enforced sporadically. If a user can authenticate with the HTTP API, they can do anything.
  • Installing the management plugin will turn on fine-grained statistics in the server. This can slow a CPU-bound server by 5-10%.
  • All sorts of other features may be missing or buggy. See the TODO file for more information.

Note: if you want to build the plugin yourself, you should be aware that right now the Erlang client does not work in the default branch, so you need a mix of versions. The following commands should work:

hg clone http://hg.rabbitmq.com/rabbitmq-public-umbrella
cd rabbitmq-public-umbrella
make checkout
hg update -r rabbitmq_v2_0_0 -R rabbitmq-server
hg update -r rabbitmq_v2_0_0 -R rabbitmq-codegen
hg update -r rabbitmq_v2_0_0 -R rabbitmq-erlang-client
hg clone http://hg.rabbitmq.com/rabbitmq-management
make
cd rabbitmq-management
make

Of course this will be fixed soon. (Ignore the above, this is fixed.)

Finally, this post would not be complete without some screenshots...