Web2py server problem - connection

I am running a web2py server that handles requests which can take anywhere from a few seconds to a few minutes to complete. Once a connection is made and the server is processing a request that takes about 2-3 minutes, new connections to the server have to wait until the earlier request is completed.
I don't know if we can tweak some parameters in web2py for this. Is there any way out of this problem?

web2py does not lock the server when busy with a connection, but it does lock the user session, on purpose. That means other users can connect, but not the one that started the original request. In the action that takes time you can do:
session._unlock(response)
and this problem (if diagnosis is correct) will go away.
Anyway, it is not a good idea to have requests that take so long. The web server may kill your process, and it is not good for usability. You should have a db table where you queue such tasks and handle them in a background process (explained in the manual), then use ajax or html5 websockets (web2py/gluon/contrib/comet_messaging.py) to check progress on the long running task.
Please bring this up on the web2py mailing list and we will help with more concrete examples.
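As a rough illustration of the "db table as a task queue" idea above, here is a minimal sketch in plain Python using the standard `sqlite3` module rather than web2py's own database layer. The table name `task` and the helpers `open_queue`, `enqueue`, `run_next`, and `status` are hypothetical names chosen for this example; a real setup would run `run_next` in a separate background process and call `status` from the ajax check.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) the task table. The schema is an assumption for this sketch."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS task ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'queued', "
        "result TEXT)"
    )
    db.commit()
    return db


def enqueue(db, payload):
    """Called from the web request: record the task and return its id immediately."""
    cur = db.execute("INSERT INTO task (payload) VALUES (?)", (payload,))
    db.commit()
    return cur.lastrowid


def run_next(db, work):
    """Called from the background process: run the oldest queued task, if any."""
    row = db.execute(
        "SELECT id, payload FROM task WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False  # nothing to do
    task_id, payload = row
    db.execute("UPDATE task SET status = 'running' WHERE id = ?", (task_id,))
    db.commit()
    result = work(payload)  # the slow part happens outside any web request
    db.execute(
        "UPDATE task SET status = 'done', result = ? WHERE id = ?", (result, task_id)
    )
    db.commit()
    return True


def status(db, task_id):
    """Called from the progress check: returns (status, result) or None."""
    return db.execute(
        "SELECT status, result FROM task WHERE id = ?", (task_id,)
    ).fetchone()
```

The web request only inserts a row, so it returns quickly; the progress check is a single indexed lookup.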

Related

How do I spread out load on my rails app from webhook responses?

I have a rails app that easily handles the traffic we currently experience, except once a day when we receive a large number of pings within a few seconds from an external service's webhook that is reporting on past transactions. Currently this causes the app to time out due to lack of db connection availability, meaning we lose some of the webhooks as well as bringing the site down for a few seconds. It's not important that the data contained in these webhooks be processed instantaneously, so I am looking for a good way to spread out the responses, rather than do an expensive upgrade just to handle these bursts with additional db connection capability.
Is it okay to just have the relevant controller method sleep for a small, random number of seconds before doing anything that would open a db connection to spread things out? Or is there a better way to do this?
Set up a background/async processing system like Sidekiq (or whatever Heroku offers). Modify your controller action to do nothing but shove the parameters into a background job and return "ok". Then process the job in the background.
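The shape of that advice can be sketched in a language-neutral way. The example below uses Python's standard `queue` and `threading` modules, not Rails or Sidekiq; `handle_webhook`, `worker`, and the `processed` list are hypothetical stand-ins. Note that an in-memory queue like this loses pending jobs if the process restarts, whereas Sidekiq keeps its jobs in Redis.

```python
import queue
import threading

jobs = queue.Queue()   # pending webhook payloads
processed = []         # stand-in for rows written to the database


def handle_webhook(params):
    """The 'controller action': enqueue the payload and return immediately."""
    jobs.put(params)
    return "ok"


def worker():
    """Background worker: drains the queue at its own pace, one job at a time."""
    while True:
        params = jobs.get()
        try:
            processed.append(params["id"])  # the db work would happen here
        finally:
            jobs.task_done()


# A single daemon thread means at most one "db connection" is busy at a time,
# no matter how many webhooks arrive in a burst.
threading.Thread(target=worker, daemon=True).start()
```

Because the handler never touches the database, a burst of webhooks only grows the queue instead of exhausting db connections.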

Using Puma and Sidekiq in a backend Rails app

I have a backend Rails server with Sidekiq, which serves as an API server. The app works as follows:
My Rails server receives many requests from incoming API clients at the same time.
For each of these requests, the Rails server will allocate jobs to a Sidekiq server. Sidekiq server makes requests to external APIs (such as Facebook) to get data, and analyze it and return a result to Rails server.
For example, if I receive 10 incoming requests from my API clients, for each request, I need to make 10 requests to external API servers, get data and process it.
My challenge is to make my app respond to incoming requests concurrently. That is, for each incoming request, my app should process in parallel: make calls to external APIs, get data and return the result.
Now, I know that Puma can add concurrency to Rails app, while Sidekiq is multi-threaded.
My question is: Do I really need Sidekiq if I already have Puma? What would be the benefit of using both Puma and Sidekiq?
In particular, with Puma, I just invoke my external API calls, data processing etc. from my Rails app, and they will automatically be concurrent.
Yes, you probably do want to use Puma and Sidekiq. There are really two issues at play here.
Concurrency (as it seems you already know) is the number of web requests that can be handled simultaneously. Using an app server like Puma or Unicorn will definitely help you get better concurrency than the default WEBrick server.
The other issue at play is the length of time that it takes your server to process a web request.
The reason that these two things are related is that the number of requests per second that your app can process is a function of both the average processing time for each request and the number of worker processes that are accepting requests. Say your average response time is 100ms. Then a single web worker can process 10 requests per second. If you have 5 workers, then you can handle 50 requests per second. If your average response time is 500ms, then you can handle 2 reqs/sec with a single worker, and 10 reqs/sec with 5 workers.
Interacting with external APIs can be slow at times, and in the worst cases it can be very unreliable, with unresponsive servers on the remote end, or network outages or slowdowns. Sidekiq is a great way to insulate your application (and your end users) from the possibility that the remote API is responding slowly. Imagine that the remote API is running slowly for some reason and that the average response time from it has slowed down to 2 seconds per request. In that case you'd only be able to handle 2.5 reqs/sec with 5 workers. With any more traffic than that, your end users might start to have a long wait time before any page on your app could respond, even those that don't make remote API calls, because all of your web workers might be waiting for the slow remote API to respond. As traffic continues to increase, your users would start getting connection timeouts.
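The throughput figures quoted above follow from one formula: maximum requests per second = workers × 1000 / average response time in milliseconds. A small sketch, with `max_requests_per_second` as a hypothetical helper name, reproduces them:

```python
def max_requests_per_second(avg_response_ms, workers):
    """Upper bound on throughput when each worker handles one request at a time."""
    return workers * 1000 / avg_response_ms


# The scenarios from the answer: (average response time in ms, number of workers)
for avg_ms, workers in [(100, 1), (100, 5), (500, 1), (500, 5), (2000, 5)]:
    rate = max_requests_per_second(avg_ms, workers)
    print(f"{avg_ms} ms, {workers} worker(s): {rate} reqs/sec")
```

This is an idealised bound: it assumes every worker is always busy and ignores queuing overhead.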
The idea with using Sidekiq is that you separate the time spent waiting on the external API from your web workers. You'd basically take the request for data from your user, pass it to Sidekiq, and then immediately return a response to the user that basically says "we're processing your request". Sidekiq can then pick up the job and make the external request. After it has the data it can save that data back into your application. Then you can use web sockets to push a notification to the user that the data is ready. Or even push the data directly to them and update the page accordingly. (You could also use polling to have the page continually asking "is it ready yet?", but that gets very inefficient very quickly.)
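The "accept, hand off, notify when ready" flow described above can be sketched with Python's standard `concurrent.futures` module. This is not Sidekiq or web sockets: `fetch_remote` is a hypothetical stand-in for the slow external API call, and the `notifications` list stands in for pushing a message to the user.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)  # plays the role of the job workers
results = {}          # stand-in for data saved back into the application
notifications = []    # stand-in for messages pushed to the user


def fetch_remote(request_id):
    """Hypothetical slow external API call."""
    return f"data for {request_id}"


def accept_request(request_id):
    """The web-facing part: hand the work off and answer right away."""
    future = executor.submit(fetch_remote, request_id)

    def on_done(fut):
        # Runs once the background job finishes: save the data, then notify.
        results[request_id] = fut.result()
        notifications.append(request_id)

    future.add_done_callback(on_done)
    return "we're processing your request"
```

The caller gets its response before the remote call completes, so a slow remote API ties up a background worker rather than a web worker.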
I hope this makes sense. Let me know if you have any questions.
Sidekiq, like Resque and Delayed Job, is designed to provide asynchronous job processing from a queue.
If you don't need jobs to be queued up and run asynchronously, there's no substantial benefit (or harm) to using Sidekiq.
If the tasks need to run synchronously (which it sounds like you might; it's not clear if clients are waiting for data or just requesting that jobs run), Sidekiq and its relatives are likely the wrong tool for the job. There is no guaranteed processing time when using Sidekiq or other solutions; jobs are pushed onto the end of the queue, however long that may be, and won't be processed until their turn comes up. If clients are waiting for data, they may time out long before your worker pool ever processes their jobs.
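To make the timeout risk concrete, here is a back-of-the-envelope estimate. It assumes jobs take roughly the same time and are spread evenly across workers; the function names and the numbers in the usage note are illustrative assumptions, not measurements.

```python
def estimated_wait_s(jobs_ahead, avg_job_s, workers):
    """Rough time a new job waits before a worker picks it up."""
    return jobs_ahead * avg_job_s / workers


def likely_times_out(jobs_ahead, avg_job_s, workers, client_timeout_s):
    """True if waiting plus running the job would exceed the client's timeout."""
    total = estimated_wait_s(jobs_ahead, avg_job_s, workers) + avg_job_s
    return total > client_timeout_s
```

For example, with 300 jobs ahead, 2 seconds per job, and 10 workers, a new job waits about 60 seconds, so a client with a 30-second timeout would give up before its job even starts.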

How to specify timeout for a particular request in ruby on rails?

How can I specify a timeout of 2 minutes for a particular request in a Rails application? One of my application's requests takes more than 5 minutes in some cases. In that case I would like to stop processing the request if it takes more than 2 minutes.
I need this configuration at the application level, so that if other such requests appear in the future I do not have to make any special changes other than listing the action in that configuration. Some requests take more than 10 minutes as well, but they should not be affected.
Thanks,
Setting the timeout for a request back that far is generally bad practice. Making your users wait for minutes on end for a request to finish isn't a good idea.
Instead, this type of long-running task should be placed into a job queue for a worker process to run at its convenience, independent of the web request. This allows the web request to finish very quickly, making your user happy, and it keeps the long-running task out of your web process, freeing it up to do what it's supposed to do (serve web requests).
Consider a gem like delayed_job. Describing how to work it into your application is outside of the scope of this question; my answer here serves only to point out that looking to modify the timeout is very likely the wrong 'answer' and that you're better off looking at a job queue.

Scaling Dynos with Heroku

I've currently got a ruby on rails app hosted on Heroku that I'm monitoring with New Relic. My app is somewhat laggy when using it, and my New Relic monitor shows me the following:
Given that the majority of the time is spent in Request Queuing, does this mean my app would scale better if I used extra worker dynos? Or is this something that I can fix by optimizing my code? Sorry if this is a silly question, but I'm a complete newbie, and appreciate all the help. Thanks!
== EDIT ==
Just wanted to make sure I was crystal clear on this before having to shell out additional moolah. So New Relic also gave me the following statistics on the browser side as you can see here:
This graph shows that the majority of the user's time is spent waiting for the web application. Can I attribute this to the fact that my app is spending the majority of its time in request queuing? In other words, is the 1.3 second response time that the end user is experiencing something that code optimization alone will do little to cut down? (Basically I'm asking if I have to spend money or not) Thanks!
Request Queueing basically means 'waiting for a web instance to be available to process a request'.
So the easiest and fastest way to gain some speed in response time would be to increase the number of web instances to allow your app to process more requests faster.
It might be possible to optimize your code to speed up each individual request to the point where your application can process more requests per minute -- which would pull requests off the queue faster and reduce the overall request queueing problem.
In time, it would still be a good idea to do everything you can to optimize the code anyway. But to begin with, add more workers and your request queueing issue will more than likely be reduced or disappear.
edit
with your additional information, in general I believe the story is still the same -- though nice work in getting to a deep understanding prior to spending the money.
When you have request queuing it's because requests are waiting for web instances to become available to service their request. Adding more web instances directly impacts this by making more instances available.
It's possible that you could optimize the app so well that you significantly reduce the time to process each request. If this happened, then it would reduce request queueing as well by making requests wait a shorter period of time to be serviced.
I'd recommend giving users more web instances for now to immediately address the queueing problem, then working on optimizing the code as much as you can (assuming it's your biggest priority). And regardless of how fast you get your app to respond, if your users grow you'll need to implement more web instances to keep up -- which by the way is a good problem since your users are growing too.
Best of luck!
I just want to throw this in, even though this particular question seems answered. I found this blog post from New Relic and the guys over at Engine Yard: Blog Post.
The tl;dr here is that Request Queuing in New Relic is not necessarily requests actually lining up in the queue and not being able to get processed. Due to how New Relic calculates this metric, it essentially reads a time stamp set in a header by nginx and subtracts it from Time.now when the New Relic method gets a hold of it. However, New Relic gets run after any of your code's before_filter hooks get called. So, if you have a bunch of computationally intensive or database intensive code being run in these before_filters, it's possible that what you're seeing is actually request latency, not queuing.
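The measurement described above can be modelled in a few lines. This is a sketch of the calculation as the post describes it, not New Relic's actual code; `reported_queue_ms` and all the timings are made-up values for illustration.

```python
def reported_queue_ms(header_timestamp_ms, agent_start_ms):
    """'Queue time' as described: agent start time minus the front-end timestamp."""
    return agent_start_ms - header_timestamp_ms


front_end_stamp_ms = 1_000               # when nginx stamped the request header
picked_up_ms = front_end_stamp_ms + 50   # a worker picks it up: 50 ms of real queuing
before_filters_ms = 800                  # slow before_filter work runs first
agent_start_ms = picked_up_ms + before_filters_ms

print(reported_queue_ms(front_end_stamp_ms, agent_start_ms))  # 850, of which only 50 is queuing
```

In this made-up case the metric reports 850 ms of "queuing" even though the request waited only 50 ms for a worker; the rest is latency from the filters.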
You can actually examine the queue to see what's in there. If you're using Passenger, this is really easy -- just type passenger-status on the command line. This will show you a ton of information about each of your Passenger workers, including how many requests are sitting in the queue. If you precede it with watch, the command will execute every 2 seconds so you can see how the queue changes over time (so just execute watch passenger-status).
For Unicorn servers, it's a little bit more difficult, but there's a ruby script you can run, available here. This script actually examines how many requests are sitting in the unicorn socket, waiting to be picked up by workers. Because it's examining the socket itself, you shouldn't run this command any more frequently than ~3 seconds or so. The example on GitHub uses 10.
If you see a high number of queued requests, then adding horizontal scaling (via more web workers on Heroku) is probably an appropriate measure. If, however, the queue is low, yet New Relic reports high request queuing, what you're actually seeing is request latency, and you should examine your before_filters, and either scope them to only those methods that absolutely need them, or work on optimizing the code those filters are executing.
I hope this helps anyone coming to this thread in the future!

Deferring blocking Rails requests

I found a question that explains how Play Framework's await() mechanism works in 1.2. Essentially, if you need to do something that will block for a measurable amount of time (e.g. make a slow external http request), you can suspend your request and free up that worker to work on a different request while it blocks. I am guessing that once your blocking operation is finished, your request gets rescheduled for continued processing. This is different from scheduling the work on a background processor and then having the browser poll for completion; I want to block the browser but not the worker process.
Regardless of whether or not my assumptions about Play are true to the letter, is there a technique for doing this in a Rails application? I guess one could consider this a form of long polling, but I didn't find much advice on that subject other than "use node".
I had a similar question about long-running requests that block workers from taking other requests. It's a problem for all web applications. Even Node.js may not solve the problem of spending too much time on a worker, or it could simply run out of memory.
A web application I worked on has a web interface that sends requests to a Rails REST API; the Rails controller then has to call a Node REST API that runs a heavy, time-consuming task to get some data back. A request from Rails to Node.js could take 2-3 minutes.
We are still trying different approaches, but maybe the following could work for you, or you can adapt some of the ideas. I would love to get some feedback too:
1. The frontend makes a request to the Rails API with a generated identifier [A] within the same session. (This identifier helps to identify previous requests from the same user session.)
2. The Rails API proxies the frontend request and the identifier [A] to the Node.js service.
3. The Node.js service adds this job to a queue system (e.g. RabbitMQ or Redis); the message contains the identifier [A]. (Here you should think about your own scenario; this also assumes a system that consumes the queued job and saves the results.)
4. If the same request is sent again, then depending on the requirement you can either kill the current job with the same identifier [A] and queue the latest request, or ignore the latest request and wait for the first one to complete, or make whatever other decision fits your business requirement.
5. The frontend can send REST requests at intervals to check whether the data processing for identifier [A] has completed; these requests are lightweight and fast.
6. Once Node.js completes the job, you can either use the message subscription system or wait for the next status-check request and return the result to the frontend.
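The identifier-based workflow above can be sketched as a small in-memory tracker. This is a Python illustration only, not the Rails/Node.js setup described; the `JobTracker` class, its method names, and the two duplicate policies ("ignore" and "replace") are hypothetical choices for this example.

```python
class JobTracker:
    """Tracks one job per identifier and applies a policy to repeated requests."""

    def __init__(self, on_duplicate="ignore"):
        if on_duplicate not in ("ignore", "replace"):
            raise ValueError("on_duplicate must be 'ignore' or 'replace'")
        self.on_duplicate = on_duplicate
        self.jobs = {}  # identifier -> {"status", "request", "result"}

    def submit(self, identifier, request):
        """Queue a job, or handle a repeat of a request that is still pending."""
        existing = self.jobs.get(identifier)
        still_pending = existing is not None and existing["status"] == "pending"
        if still_pending and self.on_duplicate == "ignore":
            return "already pending"  # keep waiting for the first request
        # New job, or 'replace' policy: the latest request supersedes the old one.
        self.jobs[identifier] = {"status": "pending", "request": request, "result": None}
        return "queued"

    def complete(self, identifier, result):
        """Called by whatever consumes the queue once the work is finished."""
        job = self.jobs[identifier]
        job["status"] = "done"
        job["result"] = result

    def check(self, identifier):
        """The lightweight status check the frontend sends at intervals."""
        job = self.jobs.get(identifier)
        if job is None:
            return ("unknown", None)
        return (job["status"], job["result"])
```

A real version would keep this state in the queue system or a database so that it survives restarts and is shared between processes.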
You can also use a load balancer, e.g. Amazon's load balancer or HAProxy. 37signals has a blog post and video about using HAProxy to offload long-running requests so that they do not block shorter ones.
GitHub uses a similar strategy to handle long requests for generating commit/contribution visualisations. They also set a limit on polling time: if it takes too long, GitHub displays a message saying it took too long and has been cancelled.
YouTube has a nice message for longer queued tasks: "This is taking longer than expected. Your video has been queued and will be processed as soon as possible."
I think this is just one solution. You can also take a look at the EventMachine gem, which helps improve performance by handling requests in parallel or asynchronously.
Since this kind of problem may involve one or more services, think about the possibility of improving performance between those services (e.g. database, network, message protocol). If caching may help, try caching frequent requests or pre-calculating results.

Resources