I want to know whether RabbitMQ is more scalable than other brokers or not.
If yes, what are the specific reasons? If not, how can we scale it up?
I am using RabbitMQ for the first time, with the Spring framework.
Even a single RabbitMQ broker is ridiculously fast. A stock desktop machine can handle tens to hundreds of thousands of messages per second.
If one rabbit turns out not to be enough, RabbitMQ supports a form of lightweight clustering that's designed specifically to improve scalability. Basically, it allows you to create "logical" brokers that are made up of many physical brokers.
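Since you're on Spring, a minimal sketch of pointing Spring AMQP at several nodes of such a cluster might look like this (the host names, credentials, and exchange/routing-key names are made up; the client simply fails over to the next address in the list if one node is unreachable):

```java
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;

public class ClusterConnectionExample {
    public static void main(String[] args) {
        // One logical connection factory that knows about several physical nodes.
        CachingConnectionFactory factory = new CachingConnectionFactory();
        factory.setAddresses("rabbit-node1:5672,rabbit-node2:5672"); // hypothetical hosts
        factory.setUsername("guest");
        factory.setPassword("guest");

        // The template publishes through whichever node the factory connected to;
        // since the cluster is one "logical" broker, routing works the same either way.
        RabbitTemplate template = new RabbitTemplate(factory);
        template.convertAndSend("some-exchange", "some.routing.key", "hello");

        factory.destroy(); // close the underlying connection when done
    }
}
```

Either way, start with a single node and only reach for clustering once you've measured that one broker genuinely can't keep up.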
Assume a VM with 4 cores. I have a Docker image with a web application that provides some REST services. I am using K8S to deploy this application on that VM. Is there any difference, in terms of performance, between running a single pod on that VM and running multiple pods on the same host?
For people who don't know K8S, assume we have an application that provides some REST services. Is there any performance advantage to running multiple instances of such an application, such as an increased rate of serving requests?
Personally, I think you get better performance by running multiple pods on the same host. I don't know which web server you use, but each pod's requests are processed within a limited share of CPU time, even if it uses multiple worker processes or threads. It is also more efficient to use the CPU during network I/O waits when there are multiple processes. To improve throughput, you should scale horizontally by adding processes or instances, because response times degrade as load grows.
I was planning on using the Mosca or Mosquitto brokers (since they are open source) to build a scalable architecture with message-queue replication, so that messages not yet delivered by a broker are not lost if that broker fails.
From what I have read, Mosquitto is a mature and very stable solution that can scale horizontally using bridges. But I couldn't find any plugin to write messages to a database common to all brokers, which I think is a limitation: if we have, for example, two load-balanced brokers and one of them dies, then the messages held by that broker cannot be delivered until it recovers.
Mosca, on the other hand, lets us scale using Redis, and if broker 1 dies, broker 2 can still deliver the messages because they are stored in a common database. That way I can use a Redis master-slave configuration to avoid a single point of failure.
So my questions are:
1) Is Mosca a good choice for production?
2) Is it possible to use Redis to store message queues with Mosquitto?
Horizontal scalability is incredibly hard to add as a feature to MQTT brokers, since it requires engineering for that scalability from the start. Also, just replicating queues of undelivered messages won't by itself give you resilience or fault tolerance.
Even if it were easy to add, I would NOT go with Redis, since it can essentially lose messages: https://aphyr.com/posts/283-jepsen-redis
If you want horizontal scalability, I'd recommend checking out a broker that has clustering and horizontal (or better: linear) scalability built in, and that tolerates network splits.
Here is a series about MQTT and clustering: http://www.hivemq.com/blog/clustering-mqtt-introduction-benefits/
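Whichever broker you pick, a client-side complement (not a substitute for broker-side replication) is to list both load-balanced brokers in the client, so it reconnects to the survivor when one dies. Here is a rough sketch with the Eclipse Paho Java client; the broker URLs and topic names are invented for illustration:

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class FailoverMqttClient {
    public static void main(String[] args) throws Exception {
        MqttClient client = new MqttClient("tcp://broker1.example.com:1883", "sensor-client-42");

        MqttConnectOptions opts = new MqttConnectOptions();
        // If the broker we're connected to becomes unreachable, Paho tries the next URI.
        opts.setServerURIs(new String[] {
                "tcp://broker1.example.com:1883",  // hypothetical broker 1
                "tcp://broker2.example.com:1883"   // hypothetical broker 2
        });
        opts.setAutomaticReconnect(true);
        opts.setCleanSession(false); // keep subscriptions across reconnects

        client.connect(opts);
        client.subscribe("sensors/+/temperature", 1);
        client.publish("sensors/dev1/temperature", new MqttMessage("21.5".getBytes()));
    }
}
```

Note that this only covers reconnecting: messages still sitting undelivered inside the dead broker remain stuck until it comes back, which is exactly the limitation described in the question.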
In a project written in Erlang, what are the best practices for organizing connections to RabbitMQ?
I have a large number of long-lived Erlang processes, each of which needs to send and receive messages through RabbitMQ.
Should I open a connection in each of them, or is a fixed-size pool better?
Is there already a library for that task?
Maybe it's even better to share a single channel?
One connection per process.
Use multiple channels within that connection. Generally speaking, 1 channel per message producer or consumer is a good place to start.
The important part, though, is that you only have 1 connection per process.
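The question is about Erlang, but the pattern is the same in any client library; here is a minimal sketch using the RabbitMQ Java client (host and queue names are assumptions) just to illustrate the one-connection, many-channels layout:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class OneConnectionManyChannels {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker location

        // One (comparatively expensive, TCP-level) connection per process...
        try (Connection connection = factory.newConnection()) {
            // ...and one (cheap) channel per producer or consumer.
            Channel producerChannel = connection.createChannel();
            Channel consumerChannel = connection.createChannel();

            producerChannel.queueDeclare("work", true, false, false, null);
            producerChannel.basicPublish("", "work", null, "job-1".getBytes());

            consumerChannel.basicConsume("work", true,
                    (consumerTag, delivery) ->
                            System.out.println("got: " + new String(delivery.getBody())),
                    consumerTag -> { });

            Thread.sleep(500); // let the consumer receive the message before closing
        }
    }
}
```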
We operate two dual-node brokers, each broker having quite different queues and workloads. Each box has 24 cores (H/T) worth of Xeon E5645 @ 2.4GHz with 48GB RAM, connected by Gigabit LAN with ~150μs latency, running RHEL 5.6, RabbitMQ 3.1, Erlang R16B with HiPE off. We've tried with HiPE on but it made no noticeable performance impact, and was very crashy.
We appear to have hit a ceiling for our message rates of between 1,000/s and 1,400/s both in and out. This is broker-wide, not per-queue. Adding more consumers doesn't improve throughput overall, just gives that particular queue a bigger slice of this apparent "pool" of resource.
Every queue is mirrored across the two nodes that make up the broker. Our publishers and consumers connect equally to both nodes in a persistent way. We notice an ADSL-like asymmetry in the rates too: if we publish at a high rate, the delivery rate drops to high double digits. Testing with an un-mirrored queue gives much higher throughput, as expected. Queues and exchanges are durable, messages are not persistent.
We'd like to know what we can do to improve the situation. The CPU on the box is fine, beam takes a core and a half for 1 process, then another 80% each of two cores for another couple of processes. The rest of the box is essentially idle. We are using ~20GB of RAM in userland with system cache filling the rest. IO rates are fine. Network is fine.
Is there any Erlang/OTP tuning we can do? delegate_count is the default 16, could someone explain what this does in a bit more detail please?
This is difficult to answer without knowing more about how your producers and consumers are configured, which client library you're using and so on. As discussed on irc (http://dev.rabbitmq.com/irclog/index.php?date=2013-05-22) a minute ago, I'd suggest you attempt to reproduce the topology using the MulticastMain java load test tool that ships with the RabbitMQ java client. You can configure multiple producers/consumers, message sizes and so on. I can certainly get 5Khz out of a two-node cluster with HA on my desktop, so this may be a client (or application code) related issue.
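If wiring up MulticastMain is awkward, even a crude probe run under roughly the conditions described (durable exchange and queue, non-persistent 1 KiB messages) can hint at whether the ceiling is client-side or broker-side. This is only a sketch with the Java client, with invented names, and it measures the raw publish rate without publisher confirms:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;

public class PublishRateProbe {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbit-node1"); // hypothetical cluster node

        Connection conn = factory.newConnection();
        Channel ch = conn.createChannel();

        ch.exchangeDeclare("perf.test", "direct", true);           // durable exchange
        ch.queueDeclare("perf.test.q", true, false, false, null);  // durable queue
        ch.queueBind("perf.test.q", "perf.test", "k");

        byte[] body = new byte[1024]; // 1 KiB payload
        int count = 100_000;
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // MINIMAL_BASIC = non-persistent, matching the setup described above
            ch.basicPublish("perf.test", "k", MessageProperties.MINIMAL_BASIC, body);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("published %d msgs in %.1fs (%.0f msg/s)%n",
                count, seconds, count / seconds);

        ch.close();
        conn.close();
    }
}
```

Comparing the rate this prints against the broker-wide rates you observe, first on an un-mirrored queue and then on a mirrored one, should show how much of the 1,000-1,400/s ceiling is down to mirroring versus the clients.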
I read in a forum that, when implementing any application using AMQP, it is necessary to keep the number of queues small. So would I be completely wrong to assume that if I were cloning Twitter I would have a unique, durable queue for each user signing up? That just seems the most natural approach, and if I shouldn't assign a unique queue to each user, how would one design something like that?
What is the most common approach for web messaging? I see RabbitHub and Rabbit WebHooks, but WebHooks doesn't seem to be a scalable solution. I am working with Rails, and my AMQP server is running as a daemon.
In RabbitMQ, queues are quite cheap. They're effectively lightweight Erlang processes, and you can run tens to hundreds of thousands of queues on a single commodity machine (e.g. my laptop). Of course, each will consume a bit of RAM, but queues that haven't been used recently will hibernate, so they consume as little memory as possible. In addition, if Rabbit runs low on memory for messages, it will page old messages to disk.
The above only applies to a single machine. RabbitMQ supports a form of lightweight clustering. When you join several Rabbit nodes into a cluster, each can see the queues and exchanges on the other nodes but each runs only its own queues. So, you'll be able to have even more queues! (to the limit of Erlang clusters, which is usually a few hundred nodes) So, a cluster forms a logical broker distributed over several machines; clients connect to it and use it transparently through any of the nodes.
That said, having a single durable queue for each user seems a bit strange: in AMQP, you cannot browse messages while they're on the queue; you can only get/consume messages, which takes them off the queue, and publish, which adds them to the end of the queue. So you can use AMQP as a message router, but you can't use it as a sort of message database.
Here is a thread that just talks about that: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-February/003041.html
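To make the "router, not database" point concrete, here is a hedged sketch of the queue-per-user layout with the RabbitMQ Java client; the exchange, queue, and routing-key names are invented for illustration:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class PerUserQueues {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection conn = factory.newConnection();
        Channel ch = conn.createChannel();

        // One topic exchange that every tweet gets published to.
        ch.exchangeDeclare("tweets", "topic", true);

        // On signup: one durable queue per user, bound to the users they follow.
        ch.queueDeclare("user.alice", true, false, false, null);
        ch.queueBind("user.alice", "tweets", "tweets.bob");   // alice follows bob
        ch.queueBind("user.alice", "tweets", "tweets.carol"); // ...and carol

        // Publishing one of bob's tweets routes a copy into each follower's queue.
        ch.basicPublish("tweets", "tweets.bob", null, "hello world".getBytes());

        // Consuming removes the message: RabbitMQ routes messages, it doesn't keep
        // them around for later browsing the way a database would.
        ch.basicConsume("user.alice", true,
                (tag, delivery) -> System.out.println(new String(delivery.getBody())),
                tag -> { });

        Thread.sleep(500); // give the consumer a moment before closing
        ch.close();
        conn.close();
    }
}
```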