Rails API, microservices, async/deferred responses - ruby-on-rails

I have a Rails API that handles requests from clients. Clients use the API to analyse their data. A client POSTs data to the API, and the API checks whether that data has been analysed before. If so, the API just responds with the analysis result. If the data hasn't been analysed before, the API:
Tells client that analysis started.
Establishes the connection with analyzing microservice.
Performs an asynchronous (or deferred, I'm not sure which) request to the analysing microservice and waits for the response. The analysis takes a long time, so neither the API nor the microservice should be blocked while it runs.
When the response from the analysing microservice comes back, the API hands it to the client.
The main issue for me is setting things up so that the client receives a message like "Your data has been sent for analysis" right after making the request, and then receives the result once the analysis is done.
The question is which approach I should use in this case. Async responses, deferred responses, something else? And which known solutions could help me with that? Any gems?
I'm new to that stuff so I'm really sorry if I ask dumb questions.

If using HTTP you can only have one response to every request. To send multiple responses, i.e. "work in progress", then later the "results", you would need to use a different protocol, e.g. WebSockets.
Since HTTP is so very common I'd stick with that in combination with background jobs. There are a couple of options which spring to mind.
Polling: The API kicks off a background job (to call the microservice) and responds to the client with a URL which the client can ping periodically for the result. The URL would respond with some kind of "work in progress" status until the result is actually ready. The URL would need to include some kind of id so the API can look up the background job.
The API would potentially have two URLS; /api/jobs/new and /api/jobs/<ID>. They would, in Rails, map to a controller new and show action.
Webhooks: Have the client include a URL of its own in the request. Once the result is available have the background job hit the given URL with the result.
Either way, if using HTTP, you will not be able to handle the whole thing within a single request/response. You will have to use some kind of background processing, so that the request to the microservice happens in a different process. You could look at Sidekiq, for example.
Here is an example for polling:
URL: example.com/api/jobs/new
web app receives client request
generates a unique id for the request, SecureRandom.uuid.
starts a background job (Sidekiq) passing in the uuid and any other parameters needed
respond with URL such as example.com/api/jobs/
--
background job
sends request to microservice API and waits for response
saves result to database with uuid
--
URL: example.com/api/jobs/UUID
look in database for UUID, if not found respond that job is "in progress". If found return result found in database.
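The polling steps above could look roughly like this in plain Ruby. Treat it as a sketch under assumptions, not as Sidekiq or Rails code: a Hash stands in for the database, a Thread stands in for the background job, and `call_microservice`, `create_job` and `show_job` are hypothetical names. Returning a 202 status from the first request is also my assumption; the steps above do not prescribe a status code.

```ruby
require "securerandom"

# In-memory stand-ins (assumptions for this sketch): RESULTS plays the role of
# the database table, WORKERS holds the threads that play the role of jobs.
RESULTS = {}
WORKERS = {}
LOCK = Mutex.new

# Hypothetical placeholder for the slow call to the analysing microservice.
def call_microservice(data)
  sleep 0.05
  "analysis of #{data}"
end

# Handler for the "new job" URL: generate an id, start the background work,
# and return immediately with the URL the client should poll.
def create_job(data)
  uuid = SecureRandom.uuid
  WORKERS[uuid] = Thread.new do
    result = call_microservice(data)
    LOCK.synchronize { RESULTS[uuid] = result } # "saves result with uuid"
  end
  { status: 202, id: uuid, url: "/api/jobs/#{uuid}" }
end

# Handler for /api/jobs/UUID: "in progress" until a result has been saved.
def show_job(uuid)
  result = LOCK.synchronize { RESULTS[uuid] }
  return { state: "in progress" } if result.nil?

  { state: "done", result: result }
end
```

A client would call `create_job` once and then call `show_job` with the returned id every few seconds until the state changes to "done".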

Depending on what kind of API you use. I assume your clients interact via HTTP.
If you want to build an asynchronous API over HTTP the first thing that you should do: accept the request, create a job, handle it in the background and immediately return.
For the client to get the response you have two options:
Implement a status endpoint where clients can periodically poll the status of the job
Implement a callback via webhooks. So the client has to provide a URL which you then call after you're done.
A good start for background processing is the sidekiq gem or more general ActiveJob that ships with Rails.
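For the webhook option, the background job would POST the result to the URL the client supplied. Here is a minimal sketch using only Ruby's standard library. The payload shape (`job_id`, `status`, `result`), the method names and the five second timeouts are assumptions for illustration, not part of any gem's API.

```ruby
require "net/http"
require "json"
require "uri"

# Build the POST that reports a finished job to the client's callback URL.
# The payload keys are an assumption for this sketch.
def build_webhook_request(callback_url, job_id, result)
  uri = URI.parse(callback_url)
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate({ job_id: job_id, status: "done", result: result })
  [uri, request]
end

# Send the request. Meant to run inside the background job, not the controller,
# so a slow or unreachable client never blocks a web request.
def deliver_webhook(callback_url, job_id, result)
  uri, request = build_webhook_request(callback_url, job_id, result)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: 5, read_timeout: 5) do |http|
    http.request(request)
  end
end
```

In practice you would probably also want to retry failed deliveries, which a job queue such as Sidekiq can do for you when the job raises.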

Related

HTTP disconnect/timeout between request and response handling

Assume following scenario:
Client is sending HTTP POST to server
Request is valid and has been processed by server. Data has been inserted into database.
Web application is responding to client
Client meets timeout and does not see HTTP response.
In this case we meet situation where:
- client does not know if his data was valid and been inserted properly
- web server (rails 3.2 application) does not show any exception, no matter if it is behind apache proxy or not
I can't find how to handle such a scenario in the HTTP documentation. My questions are:
a) should the client expect that its data MAY have been processed already? (and then try, for example, a GET request to check whether the data has been submitted)
b) if not (a), should the server detect it? Is it possible to do that in Rails? In that case the changes could be reversed. I would expect some kind of exception from the Rails application, but there is none...
HTTP is a stateless protocol, which means that by definition you cannot know on the client side whether the POST has succeeded unless you receive the response.
There are some techniques that web applications use to overcome this HTTP 'feature'. They include.
server side sessions
cookies
hidden variables within the form
However, none of these are really going to help with your issue. When I have run into these types of issues in the past they are almost always the result of the server taking too long to process the web request.
There is a really great quote that I whisper to myself on sleepless nights:
“The web request is a scary place, you want to get in and out as quick as you can” - Rick Branson
You want to be getting into and out of your web request in 100 - 500 ms. You meet those numbers and you will have a web application that will behave well/play well with web servers.
To that end I would suggest that you investigate how long your POSTs are taking and figure out how to shorten those requests. If you are doing some serious processing on the server side before doing DBMS inserts, you should consider handing it off to some sort of tasking/queuing system.
An example of 'serious processing' could be some sort of image upload, possibly with some image processing after the upload.
An example of a tasking and queuing solution would be: RabbitMQ and Celery
An example solution to your problem could be:
insert a portion of your data into the dbms ( or even faster some NoSQL solution )
hand off the expensive processing to a background task.
return to the user/web-client (even though the task is still running in the background)
listen for the final response (with polling, streaming or websockets). This step is not a trivial undertaking but the end result is well worth the effort.
Tighten up those web requests and it will be a rare day that your client does not receive a response.
On that rare day that the client does not receive the data: How do you prevent multiple posts... I don't know anything about your data. However, there are some schema related things that you can do to uniquely identify your post. i.e. figure out on the server side if the data is an update or a create.
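One way to let the server tell a repeated POST from a new one is to fingerprint the payload and refuse to insert the same fingerprint twice. This is only a sketch under assumptions: a Hash stands in for a table with a unique index, and `fingerprint` and `submit` are hypothetical names. Whether a content digest is the right identity for your data depends on your schema.

```ruby
require "digest"
require "json"

# In-memory stand-in for a table with a unique index on the fingerprint.
SUBMISSIONS = {}

# Digest of the payload. Keys are sorted first so that the same data sent
# with a different key order produces the same fingerprint.
def fingerprint(payload)
  normalized = payload.sort_by { |key, _| key.to_s }.to_h
  Digest::SHA256.hexdigest(JSON.generate(normalized))
end

# Create the record on first sight, report the existing one on a retry.
def submit(payload)
  key = fingerprint(payload)
  if SUBMISSIONS.key?(key)
    { created: false, id: SUBMISSIONS[key] }
  else
    SUBMISSIONS[key] = SUBMISSIONS.size + 1
    { created: true, id: SUBMISSIONS[key] }
  end
end
```

With this in place a client that timed out can simply send the same POST again: the second call returns the id of the record created by the first one instead of inserting a duplicate.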
This answer covers some of the polling / streaming / websockets techniques you can use.
You can handle this with ajax and jQuery as the documentation of complete callback explains below:
Complete
Type: Function( jqXHR jqXHR, String textStatus )
A function to be called when the request finishes (after success and error callbacks are executed). The function gets passed two arguments: The jqXHR (in jQuery 1.4.x, XMLHTTPRequest) object and a string categorizing the status of the request ("success", "notmodified", "error", "timeout", "abort", or "parsererror").
jQuery ajax API
As for your second question, whether there is a way to handle this through Rails: the answer is no, because the timeout happens on the client side and not on the server side. However, to revert the changes I suggest using one of the following to detect whether the user is still online
http://socket.io/
websocket-rails

How to dynamically and efficiently pull information from database (notifications) in Rails

I am working in a Rails application and below is the scenario requiring a solution.
I'm running some time-consuming processes in the background using Sidekiq and saving the related information in the database. When each process completes, we would like to show a notification in a separate area saying that the process has been completed.
So the notifications area really needs to pull things from the back-end (this notification area will be available on every page) and show them dynamically. I thought Ajax must be an option, but I don't know how to trigger it for a particular area only. Or is there any other option by which the client can fetch dynamic content from the server efficiently without creating much traffic?
I know it would be a broad topic to say about. But any relevant info would be greatly appreciated. Thanks :)
You're looking at a perpetual connection (using either SSEs or WebSockets), something Rails has started to look at with ActionController::Live
Live
You're looking for "live" connectivity:
"Live" functionality works by keeping a connection open between your app and the server. Rails is an HTTP request-based framework, meaning it only sends responses to requests. The way to send live data is to keep the response open (using a perpetual connection), which allows you to send updated data to your page on its own timescale
The way to do this is to use a front-end method to keep the connection "live", and a back-end stack to serve the updates. The front-end will need either SSEs or a WebSocket, which you'll connect to with JS
SSEs and WebSockets basically give you access to the server outside the scope of "normal" requests (SSEs use the text/event-stream MIME type)
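To make the text/event-stream format concrete, here is a small sketch of how one server-sent event frame is laid out: an optional "event:" line, one "data:" line per line of payload, and a blank line to end the frame. The `sse_frame` helper is a hypothetical name for illustration. In a controller that includes ActionController::Live you would write such a frame to `response.stream`, or use the helper class Rails ships for this.

```ruby
require "json"

# Format one Server-Sent Event frame. Non-string data is serialised as JSON.
def sse_frame(data, event: nil)
  payload = data.is_a?(String) ? data : JSON.generate(data)
  lines = []
  lines << "event: #{event}" if event
  # A multi-line payload needs one "data:" line per line of text.
  payload.each_line(chomp: true) { |line| lines << "data: #{line}" }
  lines.join("\n") + "\n\n" # the blank line terminates the frame
end
```

On the browser side an EventSource object pointed at the streaming URL would receive each frame as a separate event.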
Recommendation
We use a service called pusher
This basically creates a third-party websocket service, to which you can push updates. Once the service receives the updates, it will send it to any channels which are connected to it. You can split the channels it broadcasts to using the pub/sub pattern
I'd recommend using this service directly (I'm not affiliated with them): they have a Rails gem and provide a super simple API
Other than that, you should look at the ActionController::Live functionality of Rails
The answer suggested in the comment by #h0lyalg0rithm is an option to go.
However, primitive options are.
Use setInterval in JavaScript to perform a task every x seconds, i.e. polling.
Use jQuery or native ajax to poll for information to a controller/action via route and have the controller push data as JSON.
Use document.getElementById or jQuery to update data on the page.

Process job using workers while client waits and return response when complete

I'm building an API using Rails where requests come in and they need to be executed by a cluster of workers running on a different server (these workers call remote APIs and parse the data, etc...). I'm going to be using Sidekiq or Resque to handle the queueing/processing of that.
My issue is the client needs to wait while this is happening and the controller needs to return the response to the client once it's complete. How would I handle this in the controller? We're using a redis backend, so I was thinking something along the lines of subscribing to a pub/sub channel and waiting for the worker to publish a status message. The controller would wait for a set time period and then return a 'check back later' response to the client if it doesn't receive a message in time. What would be the best way to implement that, or is there a better solution?
Do not make your clients wait! There are a lot of issues if you make the controller block for a long running job:
Other programs may assume the request timed out (proxies, browsers, scripts, etc.)
It makes your API endpoints become a source for denial of service
It requires you to put more engineering work into web servers (since a rails process can't handle another web request while it's handling the blocking call)
Part of the reason for using Sidekiq or Resque is to avoid controllers that do heavy lifting during the HTTP request.
Instead, background jobs should report their status to the database. Then web server should query and return to the client the latest status from the database.
If clients need more immediate feedback, you can:
make clients constantly poll
post request to the client (if the API consumer is another webserver)
use another protocol mechanism (eg - websockets).
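The "report status to the database, then query it" idea could be sketched like this. It is an illustration under assumptions: a Hash stands in for the jobs table, the state names ("queued", "working", "complete", "failed") are my own choice, and `update_job`, `run_job` and `job_status` are hypothetical helpers rather than Sidekiq or Resque API.

```ruby
# In-memory stand-in for a jobs table (an assumption for this sketch).
JOB_STATUS = {}
JOB_LOCK = Mutex.new

# Merge new attributes into the stored record for this job.
def update_job(id, **attrs)
  JOB_LOCK.synchronize { JOB_STATUS[id] = (JOB_STATUS[id] || {}).merge(attrs) }
end

# What the worker would run: record every state change, including failures,
# so the web process never has to wait on the job itself.
def run_job(id)
  update_job(id, state: "working")
  result = yield
  update_job(id, state: "complete", result: result)
rescue StandardError => e
  update_job(id, state: "failed", error: e.message)
end

# What the controller would run: a cheap lookup of the latest status.
def job_status(id)
  JOB_LOCK.synchronize { JOB_STATUS.fetch(id, { state: "unknown" }).dup }
end
```

The controller action that enqueues the work would call `update_job(id, state: "queued")`, and the status endpoint would only ever call `job_status`, which returns right away whatever the job is doing.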

How to handle http calls in a delayed job?

I am implementing a JSON API using Rails. I wish to make requests to another web service using delayed job to prevent it from blocking my Rails app. So far so good. So I have a function defined in my model which does an HTTP POST to this other web service.
However, the other web service is an asynchronous API with callbacks. Hence I also want to receive callbacks from this API within my delayed job.
Is this possible? Can I have a http listener in my delayed job whose port number I can control or know within my code?

Rails application design: Queueing, Resque, Background Services, and Redis

I am designing a Rails app that takes in requests, uses data within the request to call a 3rd party web service, processes the reply, and then sends out a response to the original requestor and also issues a PUT request to yet another service.
I am trying to wrap my head around how to design this Rails app as it's different from the canonical Rails structure.
The objects are Lists and Tasks. Each List has many Tasks, and each Task belongs to a List.
The request I would get is something like:
http://myrailsapp.heroku.com/v1/lists?id=1&from=2012-02-12&to=2012-02-14&priority=high
In this example I am requesting tasks from 2/12/2012 to 2/14/2012 with a high priority in List #1
I would then issue a 3rd party web service call like this:
http://thirdpartywebservice.com/v1/lists?id=4128&from=2012-02-12&to=2012-02-14&priority=high
As you can see some processing was done on the data (id was changed in this case)
The results are then sent back to the requestor and to another web service via PUT.
My question is, how do I set up the Rails app to handle these types of behaviors? How does the controller structure change? This looks like a good use case for queues, how do I distribute multiple concurrent requests among queues?
For one thing I don't need data persistence (data can be discarded after the response is sent out) and also data structure design is simplified. (I don't think I need ruby objects, simply dictionaries or hashes representing these would be lighter weight and quicker to implement)
Edit
So I broke down the work flow of the app into these components
Parse incoming request
Construct 3rd party web service request
Send 3rd party request
Enqueue a worker to process the expected response
Process the response once it arrives
Send the parsed result back as a response
Which of the standard ruby controllers handle each of these steps? What are the models needed besides Lists and Tasks?
You should still use a database because passing data to Resque is messy. Rather, you should store it in the database and then pass the id to the workers, fetch the data, commit any new data or delete the record. It's really up to you but this method is cleaner. You can also use a push service like faye to let the user know when the processing is complete.
If you expect to have many concurrent requests, I would recommend Sidekiq as it's less of a memory hog. Having 4-5 resque workers can already suck up about 512 MB. The controller structure should not change. Please comment on anything you need clarified and I'll be happy to update my answer.
EDIT
You would want to use a separate database store, such as Postgres. Not sure if it's important what models you need, but essentially this is what should be happening.
In your controller, create a Request object which contains the query params you want to query this 3rd party service with. Then enqueue a job to be handled by Sidekiq/Resque, let's call this ThirdPartyRequest, and pass in the id of the Request object you just created as an argument. Then render a view here showing the Request object. Let's say that Request#response is still empty because it hasn't been processed yet, so let the user know it's still processing.
A worker then handles your job ThirdPartyRequest. ThirdPartyRequest should fetch the Request object and obtain the query params needed to contact the third party service. It does that, then gets a response. Update the Request object with this response, then save it.
class ThirdPartyRequest
  def self.perform(request_id)
    request = Request.find(request_id)
    # contact third party service
    request.response = ...
    request.save
  end
end
The user can continually refresh the page to check on their Request object. Once it gets updated with the response, they will know it's completed. If you want the page to refresh automatically, look into faye/juggernaut/private_pub or a SaaS solution like Pusher.

Resources