Rails message queue - Alternative to RabbitMQ

I have 4 Rails apps: App1, App2, App3 and App4.
App2, App3 and App4 need to update the same information (an email address, for example; something as simple as that) when it has been updated on App1.
For now, and since the data I need to sync doesn't change often, I went with my already-in-place Sidekiq implementation and send a POST request to the other apps from a background job. But that definitely doesn't sound like an ideal solution. This change is still in its own branch and hasn't been merged to production yet. That's why I'm here.
I've been looking at RabbitMQ, as I read it definitely has this capability. However, RMQ sounds a bit overkill for my use case, IMHO. So I would like to know: how would you go about this? Is there any other lightweight alternative?

If you want to take the RMQ route, you'd have to set up clustering and federation, which is very bloated for this kind of simple problem. You can implement your own worker queue using Redis, which Sidekiq already uses as its message store. Save RabbitMQ for critical real-time messaging systems.
RPOPLPUSH - Redis
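For the OP's case, something like the sketch below could work: a minimal reliable-queue loop built on the redis-rb gem, with one list per consuming app so each of App2/3/4 gets its own copy. The list names, payload shape and the User lookup are assumptions for illustration, not anything from the question.

```ruby
# Gemfile: gem "redis"
require "redis"
require "json"

redis = Redis.new(url: ENV.fetch("REDIS_URL", "redis://localhost:6379"))

# Producer (App1): push the change onto one list per consuming app,
# since a Redis list pop delivers each message to only one reader.
payload = { user_id: 42, email: "new@example.com" }.to_json # hypothetical payload
%w[app2 app3 app4].each { |app| redis.lpush("email_updates:#{app}", payload) }

# Consumer (e.g. App2): BRPOPLPUSH atomically moves the message to a
# processing list, so a crash mid-update doesn't lose it.
loop do
  raw = redis.brpoplpush("email_updates:app2", "email_updates:app2:processing", timeout: 5)
  next unless raw
  data = JSON.parse(raw)
  User.find(data["user_id"]).update!(email: data["email"]) # hypothetical model
  redis.lrem("email_updates:app2:processing", 1, raw)
end
```

The trade-off versus the HTTP-POST-from-Sidekiq approach is that messages survive a consumer being down; they just wait in the list until the app comes back.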

Related

Extract, transform, load within Rabbit?

One of the things I do pretty often is transform SQL data into cache and document-based stores, for performance reasons. I don't want my frontend applications hitting my database, so I have high-speed cache solutions, as well as efficient Solr and other solutions.
I use RabbitMQ as the central communication hub to achieve this ETL flow, which looks like this: the backend application sends a message to Rabbit with the new data, or changes made to existing data. I then have a node.js script which consumes the queue, makes small batches of data and populates all the necessary systems: Redis, Mongo, Solr, etc.
However, I'm wondering if there's a better way of doing this. Maybe Rabbit has some kind of scripting support to create Erlang logic for queues?
Regarding "Maybe Rabbit has some kind of scripting support to create Erlang logic for queues?": it doesn't. It's just a message-queueing system.
Personally, I think your current design sounds good.
The only thing I would wonder is whether or not each of your target systems has a queue of its own. That way, any one of them can go down without affecting the others.
I would probably do something like this (a Bunny sketch follows the list):
back-end produces data message and sends through RMQ
RMQ is configured with a fanout exchange, and has one bound queue per target system
each system receives the message in its own queue
otherwise, what you have sounds about right to me!
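A rough sketch of that topology with the Bunny gem; the exchange name, queue names and broker URL are made up for illustration:

```ruby
# Gemfile: gem "bunny"
require "bunny"
require "json"

conn = Bunny.new(ENV.fetch("AMQP_URL", "amqp://guest:guest@localhost")) # assumed broker URL
conn.start
channel = conn.create_channel

# One fanout exchange; every queue bound to it receives a copy of each message.
exchange = channel.fanout("etl.data_changes", durable: true)

# One durable queue per target system, so any one can go down independently.
%w[redis_sync mongo_sync solr_sync].each do |name|
  channel.queue(name, durable: true).bind(exchange)
end

# The backend publishes once and all three queues get the message.
exchange.publish({ table: "users", id: 42, action: "update" }.to_json,
                 persistent: true)

# Each consumer process then drains only its own queue, e.g. the Solr one:
channel.queue("solr_sync", durable: true).subscribe(block: true) do |_delivery, _props, body|
  puts "Solr consumer got: #{JSON.parse(body)}"
end
```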

Parallel asynchronous requests in SOA using a messaging broker

I've been looking at an SOA using a messaging broker (RabbitMQ / Rails), however there are still a few niggles I can't get my head around.
If I wanted to run parallel requests, as you would using something like Typhoeus with HTTP:
a) in an asynchronous system like this, when you potentially have multiple threads publishing to the same topic exchange, how do you connect the response message with your request? Would you add a unique routing key?
c) what would be the best way of initiating and managing multiple parallel calls of this nature in Ruby?
Many thanks
In answer to a): yes, you use a routing key or, in the parlance of messaging, a correlation identifier.
In answer to c): sorry, I haven't a clue about Ruby, but messaging by nature supports parallelism by using queues to manage throughput. I assume that whatever broker you choose would provide the appropriate samples and tooling for your needs.
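In Ruby terms, the usual shape is RabbitMQ's request/reply pattern: publish with a reply_to queue and a correlation_id, then match responses to requests by that id. A hedged sketch with Bunny, where "rpc_queue" is an assumed name for the server's request queue:

```ruby
require "bunny"
require "securerandom"

conn = Bunny.new
conn.start
channel = conn.create_channel

# An exclusive, server-named queue that only this client reads replies from.
reply_queue = channel.queue("", exclusive: true)
correlation_id = SecureRandom.uuid

# Publish the request, telling the responder where and how to reply.
channel.default_exchange.publish(
  "do_work",
  routing_key: "rpc_queue",        # assumed server-side queue name
  reply_to: reply_queue.name,
  correlation_id: correlation_id
)

# Match the response to our request via the correlation id.
reply_queue.subscribe(block: true) do |delivery, props, body|
  if props.correlation_id == correlation_id
    puts "Got response: #{body}"
    delivery.consumer.cancel # stop blocking once our reply has arrived
  end
end
```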
I would look at Sidekiq or Resque for jobs like that. If your system is larger and distributed, you can create a module/class which takes your job (including a key) as an argument and sends it to RabbitMQ; some worker subscribed to a fanout exchange or channel picks it up and sends the result back as a POST to your app (the webhook approach).
For simplicity you can also just put some sort of Ajax spinner on your view and poll every 10 seconds, or whatever suits you, to see whether the result is back. Either way you should have some kind of id for every job. If you have questions about it I could elaborate more. My apps crunch a lot of data in long-running tasks, with up to 500,000,000 items in Rabbit queues.
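For the polling variant, a minimal job-status endpoint might look like the sketch below; the controller, route and Redis key scheme are assumptions, not anything from the answer:

```ruby
# config/routes.rb: get "job_status/:id", to: "job_status#show"
class JobStatusController < ApplicationController
  def show
    # A worker is assumed to SET "job:<id>:status" to "done" when it finishes.
    status = Redis.new.get("job:#{params[:id]}:status")
    render json: { id: params[:id], status: status || "pending" }
  end
end
```

The view's spinner then polls this endpoint every few seconds and hides itself once the status flips to "done".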

Notify multiple Rails app when pgsql database changes

I have two Rails apps (one for web and one for backend) accessing the same PostgreSQL database. I would like to notify the other app when one app changes a table in the database.
How should I go about it?
I think this depends on:
How reliable you need it to be.
How fast you need the notifications to be delivered.
The Faye solution suggested by @techvineet provides a good fast-but-unreliable option. (N.B. I don't mean it'll fail often, but it likely will occasionally, maybe 1 in 1000. If that causes you a problem, then avoid it.)
If you need something 100% reliable and speed isn't important, you could write audit events to the database and then poll that table from each app. If these are committed in the same transaction as the actual work, you should be safe... but it'll only be as fast as your polling cycle.
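A minimal sketch of that audit-table approach, assuming hypothetical AuditEvent and SyncCursor models:

```ruby
# Writer app: the audit row commits (or rolls back) with the change itself.
ActiveRecord::Base.transaction do
  user.update!(email: new_email)
  AuditEvent.create!(table_name: "users", record_id: user.id, action: "update")
end

# Reader app: poll for unseen events, e.g. from a cron job or worker loop.
last_id = SyncCursor.value_for("users") # hypothetical cursor storage

AuditEvent.where("id > ?", last_id).find_each do |event| # find_each batches in id order
  handle_change(event)                                   # app-specific reaction
  SyncCursor.update_for("users", event.id)
end
```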
Lastly, if you want something fast AND reliable, you could look at using something like ActiveMQ or RabbitMQ to give you reliable messaging between the applications to notify changes. You'll need a worker process in each app to listen for changes and deal with them appropriately.
My last comment would be that this 'smells' a little. The fact that you're trying to do this makes me think the architecture of your app might need looking at in the longer term. An obvious way of doing it might be to encapsulate all the business logic into an app which exposes an API, and then call that API from both the frontend and backend applications.
You can try using Faye (http://faye.jcoglan.com/), which is a publish/subscribe messaging server. It can be integrated with Rails via https://github.com/jamesotron/faye-rails.git. Messages can be transferred from one app to another by publishing them and subscribing on the other side.
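Server-side, publishing to Faye is an HTTP POST of a JSON message to the endpoint. A hedged sketch, with the mount point URL and channel name made up for illustration:

```ruby
require "net/http"
require "json"

uri = URI("http://localhost:9292/faye") # assumed Faye mount point
message = { channel: "/users/changed", data: { id: 42, email: "new@example.com" } }

Net::HTTP.post(uri, message.to_json, "Content-Type" => "application/json")
# The other app subscribes to "/users/changed" and reacts to the payload.
```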
Hope this will help.

RabbitMQ with EventMachine and Rails

We are currently planning a Rails 3.2.2 application in which we use RabbitMQ. We would like to run several kinds of workers (and several instances of a worker) to process messages from different queues. The workers are written in Ruby and live in the lib directory of the Rails app.
Some of the workers need the Rails framework (Active Record, Active Model...) and some of them don't. The first worker should be called every minute to check whether updates are available. The other workers should process the messages from their queues when messages (sent by the first worker) are present, and do some (time-consuming) stuff with them.
So far, so good. My problem is that I have only a little experience with messaging systems like RabbitMQ, and none with making them interact with Rails. So I'm wondering what the best practices are to get the two playing with each other. Here are my requirements again:
Rails 3.2.2 app
RabbitMQ
Several kind of workers
Several instances of one worker
Control the amount of workers out of rails
Workers are doing time-consuming tasks, so they have to be async
Only a few workers need the Rails framework. The others are just Ruby files with some dependencies like Net or File
I was looking for a solution and came up with two possibilities:
Using amqp with EventMachine in a new thread
Of course, I don't want my Rails app to be blocked when a new worker is created. The worker should run in another thread and do its work asynchronously. Furthermore, it should not start a new instance of my Rails application; it should only require the things the worker needs.
But some articles say there are issues with Passenger. Another thing I don't like is that we are using WEBrick for development, so we'd have to include workarounds for that too. It would be possible to switch to another web server like Thin, but I don't have any experience with that either.
Using some kind of daemonizing
Maybe it's possible to run the workers as daemons, but I don't know how much overhead that would introduce, or how I could control the number of workers.
Hope someone can advise a good solution for that (and I hope I made myself clear ;)
It seems to me that AMQP is a rather big gun for your problem. Have you tried Resque? The backing Redis database has some neat features (like publish/subscribe and blocking list pop) which make it very interesting as a message queue, and Resque is very easy to use in any Rails app.
The workers are daemonized, and you decide which worker of your pool listens to which queue, so you can scale each type of job as needed.
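A rough sketch of what that could look like with Resque; the class names, queue names and the fetch_pending_update_ids helper are invented for illustration:

```ruby
# Gemfile: gem "resque"
require "resque"

# The every-minute checker; trigger it from cron or resque-scheduler.
class UpdateChecker
  @queue = :update_checks

  def self.perform
    pending_ids = fetch_pending_update_ids # hypothetical lookup
    Resque.enqueue(HeavyProcessor, pending_ids) unless pending_ids.empty?
  end
end

# The time-consuming worker; run as many instances as needed.
class HeavyProcessor
  @queue = :heavy_processing

  def self.perform(update_ids)
    # long-running work happens here, outside any Rails request cycle
  end
end

# Start daemons per queue, scaling each type independently:
#   QUEUE=update_checks rake resque:work
#   QUEUE=heavy_processing COUNT=4 rake resque:workers
```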
Using the EM reactor inside a request/response cycle is not recommended, because it may conflict with an existing event loop (for instance if your app is served by Thin); in any case you have to configure it specifically for your web server. On the other hand, it may be interesting to have an evented queue consumer if your jobs have blocking IO and are not processor-bound.
If you still want to do it with AMQP, see "Starting the event loop and connecting in Web applications" and configure it for your web server accordingly. Or use Bunny to push synchronously to the queue (with whichever job consumer you deem useful, Workling for instance).
We are running a slightly different but similar technology stack.
daemon-kit is used for the EventMachine side of the system: no Rails, but shared models (MongoMapper & MongoDB). EM pulls messages off the queues and does whatever logic is required (we have Ruleby in the mix, but if-then-else works too).
MuleSoft ESB is our outward-facing message receiver and sender that helps us deal with the HL7/MLLP world. But in v1 of the app, we used some Java code with ActiveMQ to manage HL7 messages.
The Rails app then just serves up stuff for the user to see, again using the shared models.

How is request processing with rails, redis, and node.js asynchronous?

For web development I'd like to mix Rails and Node.js, since I want to get the best of both worlds (Rails for fast web development and Node for concurrency). I know that some people choose to use a full Ruby stack with EventMachine integrated into the Rails controllers, so that every request can be non-blocking by using fibers in an event-loop model. I've been able to understand how that works in the big picture.
At the moment, however, I want to try non-blocking request processing with Rails and Node.js using a message-queue concept. I heard this can be achieved by using Redis as an intermediary. I'm still having trouble figuring out how that works. From what I can understand, we have two apps, A (Rails) and B (Node.js), plus Redis. The Rails app handles requests from users through controllers in a REST manner, then passes the work to Redis; Redis forms queues, and the Node.js app picks items off the queue and does whatever is necessary afterward (writing to or reading from the backend DB).
My questions:
1. How would that improve concurrency and scalability? From what I know, since Rails handles requests through controllers synchronously and then writes to Redis, the requests will still be blocking, even though the Node.js end can pick up the queue asynchronously. (I have a feeling it's not really asynchronous if it's not non-blocking end to end.)
2. Would Node.js be considered a proxy or an application here, if Redis is the intermediary?
3. I'm new to Redis and still learning it. If I'm using a 100% NoSQL solution for my backend database, such as MongoDB or CouchDB, can Redis replace it entirely, or is Redis seen more as a message-queue tool like RabbitMQ?
4. Is a message queue a different concurrency concept from threading or the event-loop model, or is it supposed to supplement them?
Those are all my questions. I'm new to the message-queue concept and will appreciate any help, pointers in the right direction, and articles that help me learn more. Thanks.
You are mixing some things here that don't go together.
Let's first make sure we are on the same page regarding the strengths and weaknesses of the involved technologies:
Rails: used for its web-development simplicity, and perfect for serving database-backed web applications. Not very performant when it has to serve a large number of long-running requests, as you'd run out of threads on your Ruby workers, but well suited for anything that can scale horizontally with more web nodes (multiple web servers, one DB).
Node.js: great for high-concurrency scenarios. Not as easy as Rails for writing a regular web application, but able to handle a near-insane number of long-running, low-CPU tasks efficiently.
Redis: a key-value store that supports operations on its data structures (increment/decrement values, append/prepend, push/pop on lists; all operations that make this DB work consistently with multiple clients writing at once).
Now, as you can see, there is no benefit in having Rails AND Node serve the same request, communicating through Redis. Going through the Rails stack provides no benefit if the request ends up being handled by the Node server.
And even if you only offload some processing to the Node server, it's still the Rails web server that handles the request and has to wait for a response from Node, killing the desired scalability. It simply makes no sense.
Where a setup with Node and Rails together does make sense is in certain areas of your app that have drastically different scaling requirements.
If, for example, you are writing a website that displays live stats for football games, you can easily see that there are two different concerns in your app: the "normal" site that contains signup, billing and profile stuff, which screams for a quick implementation in Rails; and the "live" portion of the site, where users see live results and you expect to handle a lot of clients at once, all waiting for something to happen (low CPU, high concurrency).
In such a case it may be beneficial to actually separate the two parts of the site into a Ruby and a Node app, sharing data about the user through a store like Redis (really you just need some shared state that both can look at and write to for synchronization purposes).
So you would use Rails for the signup/login portions; once the user is signed up, write the session cookie into Redis alongside the user's permissions (which games they're allowed to follow) and hand the user off to the Node.js app.
There the Node app can read the session information from Redis and serve the user.
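A sketch of that handoff on the Rails side; the key scheme, TTL and the live.example.com URL are invented for illustration:

```ruby
require "redis"
require "securerandom"
require "json"

redis = Redis.new

# After a successful login, stash the session state where Node can see it.
token = SecureRandom.hex(32)
redis.setex("session:#{token}", 3600, {  # expires after an hour
  user_id: current_user.id,
  allowed_games: current_user.game_ids   # hypothetical permission data
}.to_json)

# Hand the user off to the Node app with the token, e.g.:
# redirect_to "https://live.example.com/?session=#{token}"
# The Node app then reads session:<token> from Redis to authorize the connection.
```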
Word of advice:
You don't get scalability by simply throwing Node.js into your toolbox. You really have to find out what Node.js is good at (low-CPU, high-IO concurrent operations) and how you can leverage that to remedy some of the problems of your currently chosen technology.
I can answer 3 for you. Redis does not guarantee that when you perform an operation the result will actually be on disk, and its transaction handling is a bit "different". It also requires the whole dataset to fit in memory. Depending on the situation this may or may not be an issue. It is, however, incredibly fast. It is not a message queue; you can easily build a queue on top of it, but that is not its purpose. If you want a queueing system only, you can probably do better with something else.
