Parallel asynchronous processing with callbacks in a Rails controller - ruby-on-rails

I am making a Rails app and I am wondering whether it is possible to set up an asynchronous/callback architecture in the controller layer. I am trying to do the following:
When an HTTP request is made to /my_app/foo, I want to asynchronously dish out two jobs - a naive ranking job and a complicated ranking job, both of which rank 1000 posts - to several worker machines. I want to set up a callback method in the controller for each job which is called when the respective job finishes. If the complicated job does not return within X milliseconds, I want to return the output from the naive job. Otherwise, I want to return the output from the complicated job.
It is important to note that I want these jobs to be performed in parallel. What is the best way to implement such a system in Rails? I am using Phusion Passenger on Apache as my Rails server, if that helps.
Thanks.

Sounds like you should be using background jobs. In that case, when a request comes in, you would start/queue two jobs, which would be picked up and processed by workers acting independently of your Rails app. A sketch of how the controller might race the two jobs follows the links below.
Here are a few links that could be of help:
https://www.ruby-toolbox.com/categories/Background_Jobs
http://railscasts.com/episodes/171-delayed-job
http://railscasts.com/episodes/243-beanstalkd-and-stalker
http://railscasts.com/episodes/271-resque
http://rubyrogues.com/queues-and-background-processing/
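To make the "race" concrete, here is a rough sketch, not a drop-in implementation: enqueue both jobs, have each worker write its output to a shared store keyed by a request id, and have the controller wait up to the time budget for the complicated result before falling back to the naive one. Sidekiq and Redis are assumed here, and all job, key, and helper names are hypothetical.

```ruby
class RankingsController < ApplicationController
  COMPLICATED_BUDGET = 0.2 # seconds; the "X milliseconds" from the question

  def foo
    request_id = SecureRandom.uuid
    # Both jobs run in parallel on the worker machines; they are assumed to
    # write their ranked output to Redis under these keys when they finish.
    NaiveRankJob.perform_async(request_id)
    ComplicatedRankJob.perform_async(request_id)

    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + COMPLICATED_BUDGET
    result = nil
    while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
      result = redis.get("rank:complicated:#{request_id}")
      break if result
      sleep 0.01
    end
    # Fall back to the naive ranking if the complicated job missed the budget.
    result ||= redis.get("rank:naive:#{request_id}")

    render json: result
  end

  private

  # Assumed Redis connection shared with the worker machines.
  def redis
    @redis ||= Redis.new
  end
end
```

Note that this briefly blocks a Passenger worker while it polls, so keep the budget short; the later answers on this page discuss why long blocking waits in controllers are problematic.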

It's possible to issue several HTTP requests asynchronously in Rails. However, it's impossible to make Rails itself event-driven.
You can send several HTTP requests asynchronously with libraries such as Typhoeus. However, you might run into concurrency issues if your timeout is too long.
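For example, Typhoeus can run the two ranking requests in parallel through a hydra and block until both complete or time out. This is only a sketch; the worker URLs and the time budget are hypothetical:

```ruby
require "typhoeus"

# Fire both ranking requests in parallel; give the complicated one a hard budget.
naive = Typhoeus::Request.new("http://workers.internal/rank/naive")
complicated = Typhoeus::Request.new("http://workers.internal/rank/complicated",
                                    timeout_ms: 200)

hydra = Typhoeus::Hydra.new
hydra.queue(naive)
hydra.queue(complicated)
hydra.run # blocks until both requests finish or hit their timeouts

# Prefer the complicated ranking; fall back to the naive one on timeout/failure.
body = complicated.response.success? ? complicated.response.body : naive.response.body
```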
Otherwise, you can try an event-driven web framework such as Cramp or Goliath. They are both based on EventMachine, so you can try em-http-request.

Try using RabbitMQ, where you can post a message on a queue and expect the response on a reply queue. The queue consumer can even be implemented in Scala for speed. The amqp gem would suffice for what I am describing. A Rails controller with an AMQP binding would be even nicer if possible (I am exploring the option of having endpoints with AMQP bindings instead of HTTP). That would solve a good number of problems.
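A rough sketch of that request/reply pattern, using the synchronous Bunny client rather than the evented amqp gem for brevity; the queue name and payload are hypothetical:

```ruby
require "bunny"
require "securerandom"
require "json"

conn = Bunny.new
conn.start
channel = conn.create_channel

# Post the ranking request, telling the consumer where to send the reply.
reply_queue = channel.queue("", exclusive: true) # server-named, private reply queue
correlation_id = SecureRandom.uuid
channel.default_exchange.publish(
  { post_ids: [1, 2, 3] }.to_json,
  routing_key: "rank_requests", # hypothetical work queue the consumer listens on
  reply_to: reply_queue.name,
  correlation_id: correlation_id
)

# Block until the (possibly Scala) consumer replies on the reply queue.
reply_queue.subscribe(block: true) do |delivery_info, properties, body|
  if properties.correlation_id == correlation_id
    puts "ranked: #{body}"
    delivery_info.consumer.cancel
  end
end
```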

Related

What to do with a slow SOAP request?

My controller makes a slow SOAP request (it receives data from a third-party service), so the page renders only after the request completes.
I would like to render the page first; I can wait for the data from the query.
What is the best thing to do in this case?
Sidekiq? Ajax?
My recommendation is to use Active Job, which you can read more about in the Rails guide on Active Job, and add the work to a queue. For the worker backend you can choose between Delayed Job, Resque, and Sidekiq. I personally choose Resque because it has a web interface for monitoring the workers that execute the queue and is easy to set up with Redis and a systemd configuration. Here is some info about the pros and cons of those three.
RailsCasts episode 366 may also show you how to combine this with your project.
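A minimal sketch of the Active Job approach, with hypothetical class and attribute names; the SOAP data is fetched after the page has already been rendered:

```ruby
# app/jobs/fetch_soap_data_job.rb
class FetchSoapDataJob < ApplicationJob
  queue_as :default

  def perform(record_id)
    data = ThirdPartyClient.fetch(record_id) # assumed wrapper around the SOAP call
    Record.find(record_id).update!(remote_data: data)
  end
end

# In the controller: enqueue the job and render immediately;
# the page can then poll for the data via Ajax.
def show
  @record = Record.find(params[:id])
  FetchSoapDataJob.perform_later(@record.id) if @record.remote_data.nil?
end
```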

Are there existing gems or services for web hook implementations?

Most major services like GitHub provide webhook functionality.
So, with GitHub, you can set hooks to notify you on every commit.
At the same time, webhooks are not that easy.
Each webhook has to run asynchronously so as not to block the web server while communicating with the destination, which can take a good amount of time (10-15 seconds). Retry functionality also needs to be implemented (in case the destination is not responding).
So, I think there should surely be some service or library out there that will do this for you.
Do you know of any?
I need to send data to lots of endpoints and to receive a response from them.
You need a gem providing background job functionality. Sidekiq and Delayed Job are among the most frequently used.
The idea is that after the request (in Ruby on Rails you can use an after_action hook, or just do it in the controller action) you create a job which will be executed asynchronously. Put the logic you need in the job class.
Both Sidekiq and Delayed Job have retry functionality, so just pick the gem that looks simpler to use.
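As a sketch of what such a delivery job could look like with Sidekiq (the endpoint list and payload are hypothetical; Sidekiq's built-in exponential-backoff retries cover the repeating requirement):

```ruby
require "net/http"

class WebhookDeliveryJob
  include Sidekiq::Worker
  sidekiq_options retry: 10 # Sidekiq re-runs failed jobs with exponential backoff

  def perform(url, payload_json)
    response = Net::HTTP.post(URI(url), payload_json,
                              "Content-Type" => "application/json")
    # Raising makes Sidekiq schedule a retry, e.g. when the destination is down.
    raise "webhook to #{url} failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  end
end

# Enqueue one job per endpoint so a slow destination doesn't block the others.
endpoints.each { |url| WebhookDeliveryJob.perform_async(url, event.to_json) }
```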
There is a gem called ActiveHook but it does not appear to be maintained anymore.
Benedikt Deicke wrote a good article on sending webhooks with Rails; you should check it out.

Triggering a SWF Workflow based on SQS messages

Preamble: I'm trying to put together a proposal for what I assume to be a very common use-case, and I'd like to use Amazon's SWF and SQS to accomplish my goals. There may be other services that will better match what I'm trying to do, so if you have suggestions please feel free to throw them out.
Problem: The need at its most basic is for a client (mobile device, web server, etc.) to post a message that will be processed asynchronously without a response to the client - very basic.
The intended implementation is for the client to post a message to a pre-determined SQS queue. At that point, the client is done. We would also have a defined SWF workflow responsible for picking the message up off the queue and (after some manipulation) placing it in DynamoDB - again, all fairly straightforward.
What I can't seem to figure out, though, is how to trigger the workflow to start. From what I've been reading, a workflow isn't meant to be an indefinite process. It has a start, a middle, and an end. According to the SWF documentation, a workflow can run for no longer than a year (Setting Timeout Values in SWF).
So, my question is: If I assume that a workflow represents one message-processing flow, how can I start the workflow whenever a message is posted to the SQS?
Caveat: I've looked into using SNS instead of SQS as well. This would allow me to run a server that could subscribe to SNS, and then start the workflow whenever a notification is posted. That is certainly one solution, but I'd like to avoid setting up a server for a single web service which I would then have to manage / scale according to the number of messages being processed. The reason I'm looking into using SQS/SWF in the first place is to have an auto-scaling system that I don't have to worry about.
Thank you in advance.
I would create a worker process that listens to the SQS queue. Upon receiving a message, it calls the SWF API to start a workflow execution. The workflow execution id should be generated from the message content to ensure that duplicated messages do not result in duplicated workflows.
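A sketch of that poller using the AWS SDK for Ruby; the domain, queue URL, and workflow type here are hypothetical:

```ruby
require "aws-sdk-sqs"
require "aws-sdk-swf"
require "digest"

sqs = Aws::SQS::Client.new(region: "us-east-1")
swf = Aws::SWF::Client.new(region: "us-east-1")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"

loop do
  # Long polling keeps the loop cheap while the queue is empty.
  sqs.receive_message(queue_url: queue_url, wait_time_seconds: 20).messages.each do |msg|
    begin
      swf.start_workflow_execution(
        domain: "my-domain",
        # Deriving the id from the content makes duplicate deliveries idempotent.
        workflow_id: Digest::SHA256.hexdigest(msg.body),
        workflow_type: { name: "ProcessMessage", version: "1.0" },
        input: msg.body
      )
    rescue Aws::SWF::Errors::WorkflowExecutionAlreadyStartedFault
      # Duplicate message: the workflow is already running, so just drop it.
    end
    sqs.delete_message(queue_url: queue_url, receipt_handle: msg.receipt_handle)
  end
end
```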
You can use AWS Lambda for this purpose. A Lambda function can be invoked by an SQS event, so you don't have to write a queue poller explicitly. The Lambda function can then call SWF to initiate the workflow.
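With the Ruby Lambda runtime, the handler for the SQS trigger could be as small as this sketch (domain and workflow type again hypothetical):

```ruby
require "aws-sdk-swf"

def handler(event:, context:)
  swf = Aws::SWF::Client.new
  # An SQS-triggered invocation delivers a batch of records.
  event["Records"].each do |record|
    swf.start_workflow_execution(
      domain: "my-domain",
      workflow_id: record["messageId"], # the SQS message id doubles as a dedup key
      workflow_type: { name: "ProcessMessage", version: "1.0" },
      input: record["body"]
    )
  end
end
```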

Process job using workers while client waits and return response when complete

I'm building an API using Rails where requests come in and they need to be executed by a cluster of workers running on a different server (these workers call remote APIs and parse the data, etc...). I'm going to be using Sidekiq or Resque to handle the queueing/processing of that.
My issue is the client needs to wait while this is happening and the controller needs to return the response to the client once it's complete. How would I handle this in the controller? We're using a redis backend, so I was thinking something along the lines of subscribing to a pub/sub channel and waiting for the worker to publish a status message. The controller would wait for a set time period and then return a 'check back later' response to the client if it doesn't receive a message in time. What would be the best way to implement that, or is there a better solution?
Do not make your clients wait! There are a lot of issues if you make the controller block for a long-running job:
Other programs may assume the request timed out (proxies, browsers, scripts, etc.)
It turns your API endpoints into a source of denial of service
It requires you to put more engineering work into web servers (since a Rails process can't handle another web request while it's handling the blocking call)
Part of the reason for using Sidekiq or Resque is to avoid controllers that do heavy lifting during the HTTP request.
Instead, background jobs should report their status to the database, and the web server should query the database and return the latest status to the client (see the sketch after the list below).
If clients need more immediate feedback, you can:
have clients poll constantly
POST a request back to the client (if the API consumer is another web server)
use another protocol mechanism (e.g., WebSockets)
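A sketch of the status-reporting pattern, assuming Sidekiq and a hypothetical Job model with status and result columns:

```ruby
# The worker records its progress in the database as it goes.
class RemoteFetchWorker
  include Sidekiq::Worker

  def perform(job_id)
    job = Job.find(job_id)
    job.update!(status: "working")
    job.update!(status: "done", result: call_remote_apis(job)) # assumed helper
  rescue StandardError => e
    job&.update(status: "failed", error: e.message)
    raise # let Sidekiq retry
  end
end

# Clients poll this endpoint instead of blocking on the original request.
class JobStatusesController < ApplicationController
  def show
    job = Job.find(params[:id])
    if job.status == "done"
      render json: { status: "done", result: job.result }
    else
      # 202 Accepted signals "still working, check back later".
      render json: { status: job.status }, status: :accepted
    end
  end
end
```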

Suggestions for how to write a service in Rails 3

I am building an application which will send status requests to users (via email & SMS) on a regular basis. I want to execute the service each hour, which will:
Query the database for all requests that need to be sent (based on some logic)
Send the requests through Amazon's Simple Email Service (this is already working)
Write a record of the status request notification back to the data store
I am considering wrapping this series of operations up into a single controller with an endpoint that can be called remotely to kick off the process within the Rails app.
Longer term, I will break this process out into an app that can be run independently of my rails app, but for now I'm just trying to keep it simple.
My first inclination is to build the following:
Controller with the following elements:
A method which will orchestrate the steps outlined above (and can be called externally)
A call to the status_request model which will bring back a collection of requests needing to be sent
A loop to iterate through the pending requests, which will:
Make a call to my AWS Simple Email Service module to actually send the email, and
Make a call to the status_request model to log the request back to the database
Model:
A method on my status_request model which will bring back a collection of requests that need to be sent
A method in my status_request model which will log that a notification was sent
Since this will behave as a service that gets called periodically from an outside scheduler, I don't think I'll need a view for this operation. (I will, of course, need views to show users and admins what requests have been sent, but that's for later...).
As someone new to Rails, I'm asking for review of this approach and any suggestions you may have.
Thanks!
Instead of a controller, which Jeff pointed out exposes a security risk, you may just want to expose a rake task and use cron to invoke it on an hourly basis.
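That might look like the following sketch; the task, model, and mailer names are hypothetical:

```ruby
# lib/tasks/status_requests.rake
namespace :status_requests do
  desc "Send all pending status request notifications"
  task send_pending: :environment do
    StatusRequest.pending.find_each do |request|
      StatusRequestMailer.notification(request).deliver # assumed SES-backed mailer
      request.mark_sent!                                # assumed model method
    end
  end
end
```

A crontab entry such as `0 * * * * cd /path/to/app && bundle exec rake status_requests:send_pending RAILS_ENV=production` would then invoke it hourly.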
If you are still interested in building a controller, look at the devise gem and its single access token, token_authenticatable, for securing the methods you are exposing.
You may also want to look at delayed_job or resque to offload the status_request call and the AWS Simple Email Service loop to a background worker process.
You may want a separate controller and view for the log file so you can review progress on demand.
And if you want to get really fancy, use Amazon SNS to send you alerts when the service reaches some unacceptable level of failures, backlog, etc.
Since you are trying to invoke this from an outside process, your approach should work. You could also have a worker process that processes tasks as they arrive.
You will need routes to expose your service, and you may also want to make some security decisions. How will the service that invokes your application authenticate so that others can't hit it at will?
Another consideration is how many emails you are sending. If there are enough, this sort of loop is going to be extremely top-heavy and may affect users on the current system if it's a web application.
In the end, there are many ways to do this. I would focus on the performance/usage you expect as well as security. There's never one perfect way to solve a problem like this; your solution just needs to account for the constraints it will be operating within.
Resque and Redis might also be helpful to you in scheduling and performing these operations. They are simple and very fast; [here](http://railscasts.com/episodes/271-resque) is a quick tutorial on Resque.
