Respond multiple times to one request? - ruby-on-rails

One place in my Rails app requires loading a number of responses from an external server, which currently looks like this:
User makes an AJAX request to the server. "Loading data..." is displayed.
5-30 seconds later, the Rails app sends the response (assuming the data has not been cached).
It would be much better if I could keep the user informed during that long waiting period with messages informing them of the progress of the request. Such as:
User makes request (as before).
Message "Retrieving ABC" displayed
Message "Retrieving XYZ" displayed
Message "Processing data" displayed
Full response as normal.
How can I go about doing this? I don't think that sending back multiple JavaScript responses to one request is possible, but I have no idea what the correct way of doing this is!

This is tricky, but Rails supports the notion of streaming a response.
You will probably have to do a lot of work in your project to make this work, though.
Tenderlove (Aaron Patterson) posted an intro to how streaming works in Rails, and I believe there is a Railscast on this topic.
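For reference, here is a minimal sketch of what that can look like with ActionController::Live (Rails 4+, and it needs a multi-threaded server such as Puma); the controller name and the fetch_abc/fetch_xyz helpers are placeholders, not something from the answer:

class ProgressController < ApplicationController
  include ActionController::Live

  def show
    response.headers["Content-Type"] = "text/event-stream"
    response.stream.write "data: Retrieving ABC\n\n"
    fetch_abc # placeholder for the first slow external call
    response.stream.write "data: Retrieving XYZ\n\n"
    fetch_xyz # placeholder for the second slow external call
    response.stream.write "data: Processing data\n\n"
  ensure
    # the stream must always be closed, or the connection stays open
    response.stream.close
  end
end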
A probably simpler solution would be to split this into multiple requests.
The main request (assuming it's an AJAX request) takes as long as it takes to complete.
Meanwhile, the client polls the status via a different AJAX request, and the main action updates the database with its progress, so the other request can retrieve that status and send back the appropriate response (where in the process the main request currently is).
So I'd assign each request something like a request id and then have a database table for those requests and their statuses (it could be as simple as having only id:integer and status:string).
You assign the request id on the client (use some random data to create a hash or something) and start the long request with that id.
The client then polls another endpoint with that same id to get the status back.
The long-running request in the meantime updates the status table with the id it was given and where it currently is in processing that request.
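A rough sketch of those two endpoints, assuming a server that can serve the long request and the status polls concurrently; the controller and model names are invented, and a string token column is used instead of the integer id mentioned above, since the client generates the id:

class LongRequestsController < ApplicationController
  def create
    token = params[:request_id] # client-generated random id
    status = RequestStatus.create!(token: token, status: "starting")

    status.update!(status: "Retrieving ABC")
    abc = fetch_abc # placeholder for the first slow external call
    status.update!(status: "Retrieving XYZ")
    xyz = fetch_xyz # placeholder for the second slow external call
    status.update!(status: "done")

    render json: { abc: abc, xyz: xyz }
  end
end

class StatusesController < ApplicationController
  def show
    status = RequestStatus.find_by(token: params[:id])
    render json: { status: status&.status }
  end
end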

Related

Rails API, microservices, async/deferred responses

I have a Rails API which can handle requests from clients. Clients use that API to perform analysis of their data. A client POSTs the data to the API, and the API checks whether that data has been analyzed before. If so, the API just responds with the analysis result. If the data hasn't been analyzed before, the API:
Tells the client that the analysis has started.
Establishes a connection with the analyzing microservice.
Performs an asynchronous (or deferred, I'm not sure of the right term) request to the analyzing microservice and waits for the response. The analysis takes a lot of time, so neither the API nor the microservice should be blocked while doing it.
When the response from the analyzing microservice is returned, the API hands it to the client.
The main issue for me is to set things up in such a way that the client receives the message "Your data has been sent for analysis" right after making the request, and then receives the result once the analysis is done.
The question is what approach I should use in this case. Async responses, deferred responses, something else? And what known solutions could help me with that? Any gems?
I'm new to this stuff, so I'm really sorry if I'm asking dumb questions.
If using HTTP you can only have one response to every request. To send multiple responses, i.e. "work in progress" first and the "results" later, you would need to use a different protocol, e.g. WebSockets.
Since HTTP is so very common I'd stick with that in combination with background jobs. There are a couple of options which spring to mind.
Polling: The API kicks off a background job (to call the microservice) and responds to the client with a URL which the client can ping periodically for the result. The URL would respond with some kind of "work in progress" status until the result is actually ready. The URL would need to include some kind of id so the API can look up the background job.
The API would potentially have two URLs: /api/jobs/new and /api/jobs/<ID>. In Rails, they would map to a controller's new and show actions.
Webhooks: Have the client include a URL of its own in the request. Once the result is available have the background job hit the given URL with the result.
Either way, if using HTTP, you will not be able to handle the whole thing within one request/response cycle; you will have to use some kind of background processing (so the request to the microservice happens in a different process). You could look at Sidekiq, for example.
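For the webhook option, the background job could look roughly like this; the worker name, its arguments, and the call_microservice helper are assumptions for the sake of the sketch:

require "net/http"
require "json"

class AnalysisCallbackWorker
  include Sidekiq::Worker

  def perform(data_id, callback_url)
    result = call_microservice(data_id) # placeholder for the slow call
    # POST the finished result back to the client-supplied URL
    Net::HTTP.post(URI(callback_url),
                   { result: result }.to_json,
                   "Content-Type" => "application/json")
  end
end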
Here is an example for polling:
URL: example.com/api/jobs/new
web app receives the client request
generates a unique id for the request, e.g. SecureRandom.uuid
starts a background job (Sidekiq), passing in the uuid and any other parameters needed
responds with a URL such as example.com/api/jobs/UUID
--
background job
sends the request to the microservice API and waits for the response
saves the result to the database with the uuid
--
URL: example.com/api/jobs/UUID
look in the database for the UUID; if not found, respond that the job is "in progress"; if found, return the result from the database
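Sketched as code, that flow might look like the following; the controller, worker, and model names plus the api_job_url route helper are assumptions, not part of the answer:

# Assumed routes:
#   post "api/jobs"     => "api/jobs#create"
#   get  "api/jobs/:id" => "api/jobs#show", as: :api_job
class Api::JobsController < ApplicationController
  def create
    uuid = SecureRandom.uuid
    AnalysisWorker.perform_async(uuid, params[:data])
    # 202 Accepted: the work is queued but not done yet
    render json: { status_url: api_job_url(uuid) }, status: :accepted
  end

  def show
    result = AnalysisResult.find_by(uuid: params[:id])
    if result
      render json: { status: "done", result: result.body }
    else
      render json: { status: "in progress" }
    end
  end
end

class AnalysisWorker
  include Sidekiq::Worker

  def perform(uuid, data)
    body = call_microservice(data) # placeholder for the slow request
    AnalysisResult.create!(uuid: uuid, body: body)
  end
end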
It depends on what kind of API you use; I assume your clients interact via HTTP.
If you want to build an asynchronous API over HTTP, the first thing you should do is: accept the request, create a job, handle it in the background, and return immediately.
For the client to get the response you have two options:
Implement a status endpoint where clients can periodically poll the status of the job.
Implement a callback via webhooks, so the client provides a URL which you then call once you're done.
A good start for background processing is the Sidekiq gem or, more generally, Active Job, which ships with Rails.
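A minimal sketch of that accept-enqueue-return shape with Active Job; the controller, job, and model names are assumptions. Responding with 202 Accepted signals that the request was taken but not yet processed; a status endpoint for the first option would then simply render the record's status field:

class AnalysesController < ApplicationController
  def create
    # Accept the request, persist it, enqueue the work, return at once.
    analysis = Analysis.create!(status: "queued", payload: params[:data])
    AnalysisJob.perform_later(analysis.id)
    render json: { id: analysis.id, status: analysis.status },
           status: :accepted
  end
end

class AnalysisJob < ApplicationJob
  queue_as :default

  def perform(analysis_id)
    analysis = Analysis.find(analysis_id)
    result = call_microservice(analysis.payload) # placeholder
    analysis.update!(status: "done", result: result)
  end
end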

Rails - ensuring only one simultaneous controller hit per user

tl;dr - Is there a way to ensure that a given Rails controller action stops executing when another simultaneous request from the same user comes in?
In my Rails/Angular app, I make requests to the Foursquare API from the client-side. Because they need to be authenticated, and my authentication information should remain secure, I pass these requests through a Rails controller in my own app.
For a more in-depth description of the architecture of this, check out this semi-related question.
My concern, as elaborated there, is that each request to this internal controller takes up server time (and on Heroku, ties up a dyno). I've tried to make the action as fast as possible, but I'd still like to reduce the amount the server is tied up.
The amount the server is tied up is exacerbated by the real-time nature of the search I'm doing. The request is sent out to my server as a user types, not on enter or anything, because I wanted to allow for auto-suggestion.
I'm debouncing the user input (0.4 seconds), so the request isn't made until a user briefly stops typing. But if a user pauses a few times while typing, and a request goes out each time, this can quickly cause multiple dynos to get used.
More concretely, assuming a roughly ~1.3s API response time from Foursquare, imagine this scenario: A user types "ameri", then waits 0.4 seconds, then types "can", then waits 0.4 seconds, then types " beauty", completing their query. This would send three separate requests, all of which would need to be handled by different dynos, because none of the requests have a chance to return before the next comes in.
This would either cost me a ton of money (if I have a bunch of users, that means a large number of dynos to protect against concurrency timeouts) or cause really annoying waits on the user front.
So my thought would be that it would be awesome if I could essentially do a retroactive debounce on the server side, by terminating any running requests to Foursquare coming from that user before sending a new request out. That would mean that in the above concrete example, while 3 requests started, only the last request would come back, because the first two would be dropped midstream when a new one came in.
I was thinking of storing some variable in the session for each user that would be true while a request was executing. Then the next request wouldn't go out if it was set. But that's actually sort of the opposite of what I want, because I want the original request to get canceled when the new one comes in. I just don't know how to access that request from within the later one.
This feels complicated, so I'm guessing it may be impossible (particularly as each controller action is responded to by a new controller instance), but does anyone know a way to cancel controller actions if the same action is hit by the same user again while the first request is getting resolved?
Thanks!

Long processing; way to periodically send a 102 Processing response?

I have a Rails app that can take a long time to prepare its response to some queries. (Mostly the delay is rendering the dataset into JSON or YAML.) The app sits behind a proxy whose configuration I cannot alter, with the result that these long-running queries tend to get terminated by the proxy as timeouts. Chunking doesn't help because there's nothing to chunk until the render is fully complete.
Is there any supported or already existing way in Rails to set up an asynchronous repeating task that could send back 102 Processing responses to keep the proxy happy until the complete response is ready?
I would really prefer not to have to implement pagination semantics.
I have control over the app and the client; both bits are my code. I don't have control over the proxy, nor the app's server.
Any suggestions are really welcome!
I would likely solve the problem by POSTing the initial request and having the Rails app return an appropriate HTTP status code. Then I'd have JavaScript on the client side poll the server at reasonable intervals for the status of the render. The status action could return the 102 response until processing is complete. Then you could use the JavaScript to insert a link into the page that the user can click to download the finished file.
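A rough sketch of such a status action; the RenderJob model and the route are assumptions. Note that 102 is a 1xx informational status and some clients and proxies treat those specially, so a 200 with a "processing" body may be the safer variant:

class RendersController < ApplicationController
  def status
    render_job = RenderJob.find(params[:id])
    if render_job.finished?
      render json: { download_url: render_job.file_url }
    else
      head 102 # "Processing"; Rails passes the integer status through
    end
  end
end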

Rails application design: Queueing, Resque, Background Services, and Redis

I am designing a Rails app that takes in requests, uses data within the request to call a 3rd-party web service, processes the reply, and then sends a response to the original requestor while also issuing a PUT request to yet another service.
I am trying to wrap my head around how to design this Rails app as it's different from the canonical Rails structure.
The objects are Lists and Tasks. Each List has many Tasks, and each Task belongs to a List.
The request I would get is something like:
http://myrailsapp.heroku.com/v1/lists?id=1&from=2012-02-12&to=2012-02-14&priority=high
In this example I am requesting tasks from 2/12/2012 to 2/14/2012 with high priority in List #1.
I would then issue a 3rd party web service call like this:
http://thirdpartywebservice.com/v1/lists?id=4128&from=2012-02-12&to=2012-02-14&priority=high
As you can see, some processing was done on the data (the id was changed in this case).
The results are then sent back to the requestor and to another web service via PUT.
My question is, how do I set up the Rails app to handle these types of behaviors? How does the controller structure change? This looks like a good use case for queues; how do I distribute multiple concurrent requests among queues?
For one thing, I don't need data persistence (data can be discarded after the response is sent out), and the data structure design is simplified. (I don't think I need Ruby objects; simple dictionaries or hashes representing these would be lighter weight and quicker to implement.)
Edit
So I broke down the workflow of the app into these components:
Parse the incoming request
Construct the 3rd-party web service request
Send the 3rd-party request
Enqueue a worker to process the expected response
Process the response once it arrives
Send the parsed result back as a response
Which of the standard Rails controllers handles each of these steps? What models are needed besides Lists and Tasks?
You should still use a database, because passing data to Resque is messy. Rather, you should store it in the database and then pass the id to the workers, which fetch the data, commit any new data, or delete the record. It's really up to you, but this method is cleaner. You can also use a push service like Faye to let the user know when the processing is complete.
If you expect to have many concurrent requests, I would recommend Sidekiq, as it's less of a memory hog. Having 4-5 Resque workers can already use up about 512 MB. The controller structure should not change. Please comment on anything you need clarified and I'll be happy to update my answer.
EDIT
You would want to use a separate database store, such as Postgres. I'm not sure it matters exactly what models you need, but essentially this is what should happen.
In your controller, create a Request object which contains the query params you want to query this 3rd-party service with. Then enqueue a job to be handled by Sidekiq/Resque (let's call it ThirdPartyRequest) and pass in the id of the Request object you just created as an argument. Then render a view showing the Request object. Request#response will still be empty because it hasn't been processed yet, so let the user know it's still processing.
A worker then handles your ThirdPartyRequest job. ThirdPartyRequest should fetch the Request object and obtain the query params needed to contact the third-party service. It does that, gets a response back, updates the Request object with that response, and saves it.
class ThirdPartyRequest
  def self.perform(request_id)
    request = Request.find(request_id)
    # contact third party service
    request.response = ...
    request.save
  end
end
The user can continually refresh the page to check on their Request object. Once it gets updated with the response, they will know it's completed. If you want the page to refresh automatically, look into faye/juggernaut/private_pub or a SaaS solution like Pusher.
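The controller half of that design might look roughly like this; the routes, the strong-parameters list, and the params_json column are assumptions, and the local is named request_record to avoid clashing with the controller's built-in request method:

class RequestsController < ApplicationController
  def create
    # Persist the query params the worker will need, then enqueue.
    request_record = Request.create!(params_json: query_params.to_json)
    Resque.enqueue(ThirdPartyRequest, request_record.id)
    redirect_to request_record # assumes resources :requests in routes
  end

  def show
    @request = Request.find(params[:id])
    # The view renders @request.response, or "still processing" if nil.
  end

  private

  def query_params
    params.permit(:id, :from, :to, :priority)
  end
end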

HTTP GET more efficient than POST for web service?

I have been told that a POST in some way does a double send to the server but GET does not. It sounds a bit crazy to me though.
Basically I'm working on a web project where each client calls a web service every 2 seconds, from many countries and possibly over bad internet connections. So we want to make the calls and responses as tiny as possible between JavaScript and ASP.NET.
Security is not a problem and basically the poll is just returning data. Login is required to use it anyway.
"I have been told that a POST in some way does a double send to the server but GET does not."
You have been told wrong. The only difference is that POST allows sending a larger amount of data to the server, and of course the more data you send the slower the request will be. But if you send the same amount of data, there won't be any difference in performance between a GET and a POST request.
One important thing to note is that if you are calling this service from JavaScript, GET requests might be cached by the client browser. So, for example, if you call the same URL over and over using an AJAX GET request, you might get cached values and the server is never hit. To work around this issue you could append a random number to the query string; it has no meaning for the server, but it changes the URL and prevents it from being cached.
When sending via AJAX POST, some developers may have wired up the POST on both the form's submit event and a submit button's click event. When the user presses the send button, both handlers fire and the request is sent twice. This might be what the people who told you about the double sending experienced.
Note: this double sending of POST is entirely the developer's fault. The HTTP POST method has nothing to do with it.
