Pool of connections to RabbitMQ - Erlang

In a project written in Erlang, what are the best practices for organizing connections to RabbitMQ?
I have a large number of long-lived Erlang processes, each of which needs to send and receive messages through RabbitMQ.
Should I open a connection in each of them, or is a fixed-size pool better?
Is there already a library for this task?
Or is it perhaps better to share even a channel?

One connection per process.
Use multiple channels within that connection. Generally speaking, one channel per message producer or consumer is a good place to start.
The important part, though, is that you have only one connection per process.
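
A minimal sketch of that layout, using the official RabbitMQ Erlang client (amqp_client); the host, queue name, and module structure are illustrative assumptions, not part of the original answer:

```erlang
%% One shared connection, one channel per producer/consumer.
-module(rabbit_layout).
-include_lib("amqp_client/include/amqp_client.hrl").
-export([demo/0]).

demo() ->
    %% The single connection this process holds on to.
    {ok, Conn} = amqp_connection:start(#amqp_params_network{host = "localhost"}),
    %% One channel for publishing...
    {ok, PubCh} = amqp_connection:open_channel(Conn),
    %% ...and a separate channel for consuming.
    {ok, ConsCh} = amqp_connection:open_channel(Conn),
    #'queue.declare_ok'{} =
        amqp_channel:call(ConsCh, #'queue.declare'{queue = <<"demo">>}),
    amqp_channel:subscribe(ConsCh,
                           #'basic.consume'{queue = <<"demo">>, no_ack = true},
                           self()),
    receive #'basic.consume_ok'{} -> ok end,
    amqp_channel:cast(PubCh,
                      #'basic.publish'{exchange = <<>>, routing_key = <<"demo">>},
                      #amqp_msg{payload = <<"hello">>}),
    receive
        {#'basic.deliver'{}, #amqp_msg{payload = Body}} ->
            io:format("got ~p~n", [Body])
    end,
    amqp_connection:close(Conn).
```

Channels are cheap to open, so each long-lived worker can open its own channel on the shared connection rather than its own TCP connection.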

Related

Is there a restriction on opening IMAP connections from the same IP address?

Hi, I am implementing an email client application. My requirement is to monitor all the mailboxes available on a specified IMAP server. I created a separate TCP connection for each mailbox, but I keep getting disconnected from the IMAP server. I am using Gmail/Yahoo for testing purposes. Is there any restriction on opening multiple connections from the same IP to a particular IMAP server, particularly for Gmail and Yahoo?
Or is there any way to monitor all the mailboxes over a single connection? IMAP NOTIFY does not seem to be supported by either Gmail or Yahoo...
Please help me out...
This is something I answered on Stack Overflow before, but it is now only available via the Wayback Machine. The question was about how to "kill too many parallel IMAP connections". It is reprinted below; the core takeaway is that, for some reason, most server administrators prefer a small number of short-lived connections to a larger number of connections which stay active over a long period yet spend most of their time silently idling in the background. What they do not get is that the IMAP protocol is designed with long-lived connections in mind, and trying to prevent that wastes resources, because the clients will constantly resync mailboxes as they hop among them.
The original answer follows:
Nope, it's a very wrong idea. IMAP is designed so that monitoring a single mailbox takes one connection; in most IMAP server implementations, this means a single process. However, unless the client is terribly broken, all these connections enter the IDLE mode. In IDLE, the clients are passively notified about any updates to the mailbox state. If you kill these connections, the clients have to actively poll for changes in many mailboxes. Now decide for yourself: what is worse, having ten processes sitting idle, or one process doing heavy polling every two minutes? Which of these solutions would consume more energy, CPU time, and I/O operations? So much for the number of parallel connections.
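
For reference, the IDLE exchange itself is tiny. Here is a rough sketch of it over a raw TLS connection in Erlang; the host and credentials are placeholders, and a real client must parse multi-line responses instead of relying on single recv calls:

```erlang
%% Rough sketch of IMAP IDLE (RFC 2177) over TLS; illustrative only.
{ok, _} = application:ensure_all_started(ssl),
{ok, Sock} = ssl:connect("imap.example.com", 993, [binary, {active, false}]),
{ok, _Greeting} = ssl:recv(Sock, 0),
ok = ssl:send(Sock, <<"a1 LOGIN user password\r\n">>),
{ok, _} = ssl:recv(Sock, 0),
ok = ssl:send(Sock, <<"a2 SELECT INBOX\r\n">>),
{ok, _} = ssl:recv(Sock, 0),
ok = ssl:send(Sock, <<"a3 IDLE\r\n">>),
%% The server answers "+ idling" and from then on pushes untagged
%% updates such as "* 43 EXISTS" without the client polling at all.
{ok, _Updates} = ssl:recv(Sock, 0).
```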
The second question was about the long-lived connections. Again, this is a critical aspect of IMAP -- each connection carries a lot of associated state information which is rather expensive to obtain. Unless your server implements certain extensions and your clients use them (ESEARCH, CONDSTORE and QRESYNC are the crucial bits), opening a mailbox can require O(n) operations. I don't know how many messages your users have, but do you really want to transfer, say, the message flags of 250k messages just because you decided to kill a connection that had been active for "too long"?
Finally, any reasonable IMAP server vendor offers a way to configure a per-user session limit on the number of concurrent processes. Using that is much better than maintaining a script for ad-hoc killing of "unused" connections.
If you would like to learn more about the synchronization process, my thesis about using IMAP on clients with flaky network and limited resources describes what the clients have to do in order to show an updated view of mailboxes to their users.

MPI: How many sockets?

I am working on an MPI application which uses threaded MPI calls between processes. Threads are added and removed as the load requires. Now I have a question for which I could not find an answer on the Open MPI forum.
If a set of MPI processes ("ranks") already has a connection, i.e., they are already making send/receive calls, and then a new thread comes in (in either process) which also makes send/receive calls between the same MPI peers, would MPI open up a new set of sockets?
I know that the details are implementation-dependent, so there may not be a general answer. But is there a way to find out?
There are questions about the scalability of this technique, which was chosen for other reasons, so it would be great to get some statistics on the number of new sockets per connection.
Does anyone know how to do this? For instance, how to query which socket a particular invocation of MPI_Send is writing to?
I have already tried adding --mca btl self,sm,tcp --mca btl_base_verbose 30 -v -report-pid -display-map -report-bindings -leave-session-attached
Thanks a lot.
To answer my own question, here is what I learnt from the brilliant folks at Open MPI:
On Jan 24, 2012, at 5:34 PM, devendra rai wrote:
I am trying to find out how many separate connections are opened by MPI as messages are sent. Basically, I have threaded MPI calls to a bunch of different MPI processes (which, in turn, make threaded MPI calls).
The point is: with every thread added, are new ports opened (even if the sender-receiver pairs already have a connection between them)?
In Open MPI: no. The underlying connections are independent of how many threads you have.
Is there any way to find out? I went through the MPI APIs, and the closest thing I found was related to cartographic information. This is not sufficient, since it only tells me the logical connections (or does it?).
MPI does not have a user-level concept of a connection. You send a message, a miracle occurs, and the message is received on the other side. MPI doesn't say anything about how it got there (e.g., it may have even been routed through some other process).
Reading the Open MPI FAQ, I thought adding "--mca btl self,sm,tcp --mca btl_base_verbose 30 -display-map" to mpirun would help, but I am not getting what I need. Basically, I want to know how many ports each process is accessing (reading as well as writing).
For Open MPI's TCP implementation, it's basically one TCP socket per peer (plus a few other utility fd's). But TCP sockets are only opened lazily, meaning that we won't open the socket until you actually send to a peer.
--
Jeff Squyres
Credits to Jeff Squyres.

What's the best way to 'ping' thousands of servers every minute?

I run a server-monitoring site for a video game. It monitors thousands of servers (currently around 15,000).
My current setup is a bit janky, and I want to improve it. Currently I use cron to submit every server to a Resque job queue, and I refill the queue as soon as it's empty, essentially creating a constantly working queue. Each job simply tries to open a socket connection to the server IP and port in question, and marks the server as down if it fails to connect.
I have 20 workers, and they get through the full list in about 5 minutes. I feel that this should be able to go MUCH faster.
Is there a better, quicker way of doing this?
So what you are doing currently, I assume, is opening a TCP socket connection to ping your game server. The problem with TCP here is that every check pays for a full three-way handshake before it learns anything, which makes it a lot slower than a single UDP request/response.
What I would advise instead is creating a UDP socket that just queries the game server port.
Here's a nice quote from another question:
> UDP is really faster than TCP, and the simple reason is its
> lack of an acknowledgement packet (ACK), which permits a continuous
> packet stream, instead of TCP, which acknowledges each packet.
Read this question here: UDP vs TCP, how much faster is it?
From my experience with game servers, the majority, if not all, of modern game servers allow you to query them on a UDP socket. The server then responds with details about itself. (I used to host a lot of servers myself, too.)
So basically, make sure that you are using UDP rather than TCP...
Example Query
I'm just searching for this information now and will update my answer when I find a source. What game is it that you are trying to get information for?
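
In the meantime, here is a sketch of such a probe in Erlang with gen_udp; the payload shown is the Source-engine A2S_INFO query, purely as one concrete example, and you would substitute whatever query packet your game's protocol actually expects:

```erlang
%% Hedged sketch: one availability probe over UDP.
probe(Host, Port) ->
    {ok, Sock} = gen_udp:open(0, [binary, {active, false}]),
    %% Example payload only: the Source-engine A2S_INFO query.
    Query = <<16#FF, 16#FF, 16#FF, 16#FF, "TSource Engine Query", 0>>,
    ok = gen_udp:send(Sock, Host, Port, Query),
    %% Any reply within 2 seconds counts as "up".
    Status = case gen_udp:recv(Sock, 0, 2000) of
                 {ok, _Reply}     -> up;
                 {error, timeout} -> down
             end,
    gen_udp:close(Sock),
    Status.
```

Because UDP packets can simply be dropped, a production checker should retry once or twice before marking a server as down.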
Use typical solutions for typical tasks. This case, availability detection every n seconds, is a daily sysadmin task. It should not be done over ICMP; use SNMP over UDP instead. Complete solutions include Nagios, Cacti, and Zabbix, which have built-in functionality to track everything about your servers (load average, disk, RAM, I/O, and network) as well as availability detection.
You don't mention how you are making the socket connections, but you might want to try using the Ruby curl bindings (curb) instead of net/http.
curb will typically be much faster.

Is it necessary to use as few queues as possible? And solutions for web messaging

I read in a forum that when implementing an application using AMQP, it is necessary to use as few queues as possible. So would I be completely wrong to assume that, if I were cloning Twitter, I would have a unique and durable queue for each user signing up? It just seems the most natural approach, and if you don't assign a unique queue to each user, how would you design something like that?
What is the most common approach for web messaging? I see RabbitHub and Rabbit WebHooks, but webhooks don't seem to be a scalable solution. I am working with Rails, and my AMQP server is running as a daemon.
In RabbitMQ, queues are quite cheap. They're effectively lightweight Erlang processes, and you can run tens to hundreds of thousands of queues on a single commodity machine (i.e., my laptop). Of course, each will consume a bit of RAM, but queues that have not been used recently will hibernate, so they'll consume as little memory as possible. In addition, if Rabbit runs low on memory for messages, it will page old messages to disk.
The above only applies to a single machine. RabbitMQ supports a form of lightweight clustering. When you join several Rabbit nodes into a cluster, each can see the queues and exchanges on the other nodes, but each runs only its own queues. So you'll be able to have even more queues (up to the limit of Erlang clusters, which is usually a few hundred nodes). A cluster thus forms a logical broker distributed over several machines; clients connect to it through any of the nodes and use it transparently.
That said, having a single durable queue for each user seems a bit strange: in AMQP, you cannot browse messages while they're on the queue; you may only get/consume messages, which takes them off the queue, and publish, which adds them to the end of the queue. So you can use AMQP as a message router, but you can't use it as a sort of message database.
Here is a thread that just talks about that: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-February/003041.html
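
If you do go the queue-per-user route anyway, the declarations themselves are trivial. A sketch using the amqp_client record API, where the exchange name, the queue naming scheme, and the routing are illustrative assumptions:

```erlang
%% One durable queue per user, bound to a shared direct exchange.
declare_user_queue(Ch, UserId) when is_binary(UserId) ->
    Queue = <<"user.", UserId/binary>>,
    #'exchange.declare_ok'{} =
        amqp_channel:call(Ch, #'exchange.declare'{exchange = <<"tweets">>,
                                                  type = <<"direct">>,
                                                  durable = true}),
    #'queue.declare_ok'{} =
        amqp_channel:call(Ch, #'queue.declare'{queue = Queue, durable = true}),
    %% Messages published to "tweets" with the user's id as the routing
    %% key end up in that user's queue.
    #'queue.bind_ok'{} =
        amqp_channel:call(Ch, #'queue.bind'{queue = Queue,
                                            exchange = <<"tweets">>,
                                            routing_key = UserId}),
    Queue.
```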

Is RabbitMQ more scalable than JMS outbound queue?

I want to know whether RabbitMQ is more scalable than other brokers or not.
If yes, what are the specific reasons? If not, how can we scale it up?
I am using RabbitMQ for the first time, with the Spring framework.
Even a single RabbitMQ broker is ridiculously fast. A stock desktop machine can handle tens to hundreds of thousands of messages per second.
If one rabbit turns out to not be enough, RabbitMQ supports a form of light-weight clustering that's designed specifically to improve scalability. Basically, it allows you to create "logical" brokers that are made up of many physical brokers.
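
For what it's worth, joining nodes into such a cluster takes only a few rabbitmqctl commands; this is the modern join_cluster syntax, and the node names are placeholders:

```
# On the node that should join the existing cluster of rabbit@node1:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@node1
rabbitmqctl start_app
rabbitmqctl cluster_status   # verify that both nodes are listed
```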
