I discovered the Servlet 3.0 asynchronous facility today. I have read about it and think I understand the concept.
I was wondering: would it make any difference for "standard" controller actions, or should it be reserved for web services and computationally expensive processes?
In other words, is it a bad idea to use it on all of one's controller actions, without considering the computational time of each action method beforehand?
If it is, could you explain to me why?
Thank you in advance.
No, this would be a bad idea.
On a controller action, you get a request and you want to serve a response as soon as possible. You should use the asynchronous facility only for things that can be delayed.
If a user is requesting a page on your website, you can't respond with an empty page and then push an update to it later. I would use this feature only for AJAX requests, and even then not for all of them. You have to decide what makes sense to run asynchronously and what does not.
You should read the Grails documentation on Asynchronous Request Handling.
In general for controller actions that execute quickly there is little benefit in handling requests asynchronously. However, for long running controller actions it is extremely beneficial.
The reason is that with an asynchronous / non-blocking response, the one thread == one request == one response relationship is broken. The container can keep a client response open and active while returning the thread to the pool to handle another request, which improves scalability.
Hopefully this should be clear enough, but please ask if something is not clear.
I have a very weird situation: I have a system where a client app (Client) makes an HTTP GET call to my Rails server, and that controller does some handling and then needs to make a separate call to the Client via a different pathway (i.e. it actually goes via Rabbit to a proxy and the proxy calls the Client). I can't change the pathway for that different call and I can't change the Client at all (it's a 3rd party system).
However, the issue is this: the call via the different pathway fails UNLESS the HTTP GET from the Client has completed.
So I'm trying to figure out: is there a way to have Rails finish the HTTP GET response and then make this additional call?
I've tried:
1) after_filter: this doesn't work because the after filter is apparently still within the request/response cycle, so the TCP/HTTP response back to the Client hasn't completed.
2) enqueuing a worker: this works, but it is not ideal, because if the workers are backed up, this callback to the Client may not happen right away, and it really does need to happen right after the Client calls the Rails app.
3) starting a separate thread: this may work, but it makes me nervous: adding threading explicitly in Rails could be fraught with peril.
I welcome any ideas/suggestions.
Again, in short, the goal is: process the HTTP GET call to the Rails app and return a 200 OK to the Client, completely finishing the HTTP request/response cycle, and then run some extra code.
I can provide any further details if that would help. I've found both #1 and #2 as recommended options but neither of them are quite what I need.
Ideally, there would be some "after_response" callback in Rails that allows some code to run but after the full request/response cycle is done.
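As an illustration of option 3 above, here is a plain-Ruby sketch (outside Rails; all names are hypothetical) of finishing the response first and running the extra call in a thread:

```ruby
# Hypothetical sketch: finish the "response" first, then run the
# out-of-band callback in a plain Ruby thread. In real Rails code the
# thread would also need its own ActiveRecord connection.
def handle_request
  response = "200 OK"             # the request/response cycle ends here
  worker = Thread.new do
    sleep 0.01                    # simulate the callback to the Client
    "callback sent"
  end
  [response, worker]
end

response, worker = handle_request
puts response        # the Client already has its 200 OK
puts worker.value    # the extra call completes afterwards
```

The caveat from the question stands: a bare thread shares no lifecycle management with the server, which is why a gem that handles thread and connection cleanup is attractive.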
Possibly use an around filter? Around filters let us define methods that wrap around every action Rails calls. With an around filter on the above controller, I could control the execution of every action: execute code before calling the action and after calling it, and even skip calling the action entirely under certain circumstances if I wanted to.
So what I ended up doing was using a gem that I had long ago helped with: Spawnling
It turns out that this works well, although it required a tweak to get it working with Rails 3.2. It lets me spawn a thread to do the extra, out-of-band callback to the Client while letting the normal controller process complete. And I don't have to worry about thread management or ActiveRecord connection management; Spawnling handles that.
It's still not ideal, but pretty close. And it's slightly better than enqueuing a Resque/Sidekiq worker as there's no risk of worker backlog causing an unexpected delay.
I still wish there was an "after_response_sent" callback or something, but I guess this is too unusual a request.
I upload photos with AJAX, and the image manipulations and the upload to S3 take a lot of time. I have heard that it's better to run such tasks in the background, but my app needs to wait until the photos are uploaded. If I choose the background route, I will need to use WebSockets or repeated AJAX polling to check for the result (the S3 links), which I'm not happy about.
Why is it so bad to do heavy work right in the controller (foreground)?
I currently use TorqueBox (JRuby), which as I understand it has excellent concurrency. Does that mean waiting for the S3 upload will not consume resources, and everything will work fine?
Please describe the pros and cons of background vs. foreground in my situation. Thank you!
It is generally considered bad practice to block a web request handler on a network request to a third-party service. If that service becomes slow or unavailable, it can clog up all your web processes, regardless of which Ruby implementation you are using. This is what you are referring to as 'foreground.'
Essentially this is the flow of your current setup (in foreground):
a user uploads an image on your site and your desired controller receives the request.
Your controller makes a synchronous request to s3. This is a blocking request.
Your controller waits
Your controller waits
Your controller (continues) to wait
finally, (and this is not guaranteed) you receive a response from s3 and your code continues and renders your given view/json/text/etc.
Clearly steps 3-5 are very bad news for your server, and as I stated earlier, this worker/thread/process (depending on your Ruby/Rails server framework) will be held up until the response from S3 is received (which could potentially never happen).
Here is the same flow with a background job with some javascript help on the front-end for notification:
a user uploads an image on your site and your desired controller receives the request.
Your controller creates a new thread/process to make the request to s3. This is a non-blocking approach. You set a flag on a record that references your s3 image src, for example completed: false and your code continues nicely to step 3. Your new thread/process will be the one waiting for a response from s3 now, and you will set the 'completed' flag to true when s3 responds.
You render your view/json/text/etc, and inherently release your worker/thread/process for this request...good news!
now for the fun front end stuff:
your client receives your response, triggering your front-end javascript to start a setInterval-like repetitive function that 'pings' your server every 3-ish seconds, where your back-end controller checks to see if the 'completed' flag that you set earlier is true, and if so, respond/render true.
your client-side JavaScript receives your response and either continues to ping (until you designate that it should give up) or stops pinging because your app responded true.
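The flow above can be sketched in plain Ruby, with a thread standing in for a real background worker and an in-memory hash standing in for the database record (all names here are illustrative):

```ruby
# Hypothetical sketch of the background-upload flow: the "controller"
# kicks off the work and returns immediately; the worker flips a
# completed flag that the front-end's polling endpoint would check.
RECORDS = { 1 => { completed: false, src: nil } }

def upload_to_s3(id)
  Thread.new do
    sleep 0.01                      # simulated S3 round trip
    RECORDS[id][:src] = "https://s3.example.com/photo-#{id}.jpg"
    RECORDS[id][:completed] = true  # the flag the front-end polls for
  end
end

def completed?(id)                  # what the "ping" endpoint would check
  RECORDS[id][:completed]
end

worker = upload_to_s3(1)
puts completed?(1)   # false: the controller returned without waiting
worker.join
puts completed?(1)   # true: the background work finished
```

In a real app a background-job library would replace the bare thread, and the flag would live on an ActiveRecord model rather than in a constant.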
I hope this sets you on the right path. I figured writing code for this answer was inferior because it seemed like you were looking for pros and cons. For actual implementation ideas, I would look into the following:
Sidekiq is excellent for solving the background-job issues described here. It will handle creating the new process in which you can make the request to S3.
Here is an excellent RailsCast that will help you get a better understanding of the code.
I am trying to code an API which has a long running process to which an end user may make a POST request:
POST /things { "some":"json" }
The actual creation process can take some time, will often be queued. It might take many minutes. As a result I am not sure what I should be returning or when. Is it the usual 201 plus object, returned after whatever time it takes my API to create the object? Isn't this going to cause problems at the client end? Is there some other standard way to do this - such as an intermediate step?
I'm using Rails & Grape for my API if that helps.
Consider whether the Post-Redirect-Get pattern suits your needs. For example, you can return a 303 redirect to some sort of status page where the client can check the progress of the request. In general, 201+object is a poor choice if the client has to wait for any appreciable period, because too many things can go wrong (what if out of annoyance or impatience he kills the browser window, or refreshes, or resubmits?)
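A plain-Ruby sketch of that pattern, with hashes standing in for the framework's request and response objects (routes and names are illustrative, not a real Grape API):

```ruby
# Hypothetical sketch of Post-Redirect-Get for a long-running create:
# the POST queues the work and answers 303 See Other with a Location
# pointing at a status resource the client can then poll with GET.
JOBS = {}

def post_things(payload)
  id = JOBS.size + 1
  JOBS[id] = { status: "queued", payload: payload }   # enqueue, don't wait
  { code: 303, location: "/things/status/#{id}" }
end

def get_status(id)
  { code: 200, body: { status: JOBS[id][:status] } }
end

resp = post_things("some" => "json")
puts resp[:code]                     # 303
puts resp[:location]                 # /things/status/1
puts get_status(1)[:body][:status]   # queued
```

When the work finishes, the status endpoint can report completion and include the created resource's URL; at that point returning the full object also makes sense.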
I noticed that in a standard grails environment, a request is always executed to the end, even when the client connection is lost and the result can't be delivered anymore.
Is there a way to configure the environment in such a way that execution of a request is canceled as soon as the client connection is lost?
Update: Thanks for the answers. Yes, most of the problems I am trying to avoid can be avoided by better coding:
caching can make nearly every page fast
a token can help to avoid submitting something twice
but there are some requests that could still take some time. Take a map service as an example: calculating a route will take a while. One way to avoid resubmitting the request could be a "calculationInProgress" flag together with a message to the user. But it would still be possible to create a lot of sessions, and thus a lot of requests, in order to mount a DoS attack...
I am still curious: is there no way to configure the server to cancel the request? I used to develop on a system where the server behaved this way and it was great :-)
Probably there is no such way. I'm sure Grails (and your web container) is designed to:
accept incoming request
process it on server side
send response
If something happens during phase 2, you will only find out about it during the send-response phase. You could write data to the HttpServletResponse yourself, handle the IOException, and so on, but that would be far too low-level, I think, and it would not help you cancel your DB operations while you're preparing the data to send.
By the way, it's a common pattern to put a web frontend such as nginx in front of the application; it accepts incoming requests and handles all these problems with cancelled requests, slow requests (I guess that's the real problem?), and so on.
According to your comment, it is reloads and multiple clicks that you are trying to avoid. The proper technique is to use Grails' support for handling multiple form submissions:
http://grails.org/doc/2.0.x/guide/theWebLayer.html#formtokens
I am going to build an ASP.NET MVC 3 web page.
View: the view (web page) makes about five Ajax (jQuery) calls to controller methods that return JsonResult, and renders the results on the page.
Controller: the controller methods read a SQL Server 2008 database using EF4. Two of the SQL statements may take half a minute to execute, depending on the server load.
I would like users to see the content returned by the quick controller/database calls as soon as possible. The page will not have many users (maybe up to 15). Will the long-running controller method calls block the others if they are not asynchronous? Or is that irrelevant as long as the thread pool is big enough to handle the peak number of requests?
From the user's view, loading the initial web page is synchronous, i.e. he has to wait until the server delivers the page. The Ajax requests however look asynchronous to him because he can already see part of the page.
From the server's view, everything is synchronous. There is an HTTP request that needs to be processed and the answer is either HTML, JSON or whatever. The client will wait until it receives the answer. And several requests can be processed in parallel.
So unless you implement some special locking (either on the web server or in the database) that blocks some of the requests, nothing will be blocked.
The proposed approach seems just fine to me.
Update:
There's one thing I forgot: ASP.NET contains a locking mechanism to synchronize access to the session data that can get into the way if you have several concurrent requests from the same user. Have a look at the SessionState attribute for a way to work around that problem.
Update 2:
And for asynchronous behavior from the user's point of view, there's no need to use the AsyncController class. It was built for something else, which is not relevant in your case since you only have 15 users.
Will the long run controller method calls block others if they are not asynchronous?
The first important thing to note is that none of those controller actions should have write access to the Session. If they write to the Session, then whether they are sync or async, they will always execute sequentially and never in parallel. That's because the ASP.NET Session is not thread-safe: if multiple requests from the same session arrive, they are queued. If you are only reading from the Session, it is OK.
Now, the slow controller actions will not block the fast controller actions, no matter whether they are synchronous or not. For long controller actions it could make sense to make them asynchronous, but only if you are using the asynchronous ADO.NET methods to access the database and thus benefit from I/O completion ports. These allow you to avoid consuming threads during I/O operations such as database access. If you use standard blocking calls to the database, you get no benefit from async actions.
I would recommend the following article for a deeper understanding of when asynchronous actions can be beneficial.