How to hunt down a long-running request in Rails

We have a customer complaining about a long-running request. I found the request in the production.log but am not sure how to dig deeper into figuring out why it took so long. Are there any artifacts in the log that I should look for?
Also, the DB and View times don't add up to the total request time.

Try New Relic RPM (the newrelic_rpm gem). It can parse your logs and show you the slowest requests, along with a lot of other information. It also shows live statistics about the app. The trial should be enough for you to fix your application.
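If you want to triage straight from the logs before reaching for a tool: the artifact to look for is the request's "Completed" line, which breaks the total into view and ActiveRecord time. The remainder (total minus Views minus ActiveRecord) is time spent in your own Ruby code and garbage collection, which the log line does not itemize; that is why the DB and View times rarely add up to the total. A minimal sketch, assuming the standard Rails completion-line format, that flags slow requests and shows the unaccounted portion:

    # Scan production.log for completion lines such as:
    #   Completed 200 OK in 1243ms (Views: 120.4ms | ActiveRecord: 80.2ms)
    PATTERN = /Completed \d{3} .+? in (\d+)ms \(Views: ([\d.]+)ms \| ActiveRecord: ([\d.]+)ms\)/

    File.foreach("log/production.log") do |line|
      next unless (m = line.match(PATTERN))
      total, views, db = m[1].to_f, m[2].to_f, m[3].to_f
      other = total - views - db # controller/model Ruby time, GC, etc.
      if total > 1000
        puts format("total=%5dms views=%5dms db=%5dms other=%5dms", total, views, db, other)
      end
    end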

Related

Cloudflare + Heroku (Rails) Failed Requests

Our Ruby on Rails application is running on Heroku behind Cloudflare. Occasionally, requests to our server simply don't make it. We learned about this through users emailing us saying that their data is not saving. I installed Bugsnag on the front-end to track how often this occurs. In short, it's a very small percentage of requests (<1%), but when it happens, it's extremely frustrating to our users.
The response from Cloudflare contains a status of "-1" and no other information. I'm confident these requests are not reaching our server, since I've set up logging there, too. Further, simply retrying the request works approximately 2/3 of the time, but that has negative consequences for us, too (duplicate data in our database).
Has anyone experienced this before? Any ideas where to look next? We've followed the Cloudflare <> Heroku setup guide closely. We're operating in Full (Strict) SSL. I've done a TON of reading on this. Any help would be GREATLY appreciated.
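One way to make those retries safe, wherever the drops originate, is an idempotency key, so a replayed request cannot insert twice. A minimal sketch, assuming a hypothetical Order model with a client_token column backed by a unique database index; the client generates one token (e.g. a UUID) per logical submission and resends the same token on every retry:

    class OrdersController < ApplicationController
      def create
        order = Order.create!(order_params.merge(client_token: params[:client_token]))
        render json: order, status: :created
      rescue ActiveRecord::RecordNotUnique
        # A retry of a request that already succeeded: return the existing
        # record instead of inserting a duplicate.
        render json: Order.find_by!(client_token: params[:client_token])
      end

      private

      def order_params
        params.require(:order).permit(:item, :quantity)
      end
    end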

Time to First Byte for Heroku server

Our Ruby on Rails website on Heroku shows a Waiting (Time to First Byte) time in the Chrome inspector of 1000ms+ on most requests for all pages.
However, Heroku's logs and New Relic both show total response times of under 200ms (and this includes request queuing).
The Heroku app has two dynos and doesn't go idle.
What could account for the 800ms, on average, of missing time?
I believe this discrepancy is due to the time it takes to leave the server vs. the time it takes to be received by the client.
I sent a support ticket to Heroku support, who were great enough to look into it and show that the latency was occurring before the request arrived at the Heroku server. The application makes use of CloudFlare, which was contributing to the latency. CloudFlare has a blog write-up on this at https://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/ which goes into full detail.
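For what it's worth, you can reproduce the browser's measurement from a script. A minimal Ruby sketch (the URL is a placeholder for your app) that separates time-to-first-byte from total download time:

    require "net/http"
    require "uri"

    uri = URI("https://www.example.com/") # placeholder: your app's URL

    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      http.request(Net::HTTP::Get.new(uri)) do |response|
        # The block is yielded once the headers arrive, before the body is
        # read, so this is roughly Chrome's "Waiting (TTFB)" number.
        ttfb = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
        response.read_body
        total = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
        puts format("TTFB: %.0f ms, total: %.0f ms", ttfb * 1000, total * 1000)
      end
    end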

Random slow Rack::MethodOverride#call on rails app on Heroku

Environment:
Ruby: 2.1.2
Rails: 4.1.4
Heroku
In our Rails app hosted on Heroku, there are times when requests take a long time to execute. It happens on just 1% of requests or less, but we cannot figure out why it is happening.
We have the New Relic agent installed, and it says that it is not request queuing; it is the transaction itself that takes all that time to execute.
However, the transaction trace shows something different (this same request usually takes only 100ms to execute): as far as I can tell, the time is being consumed before our controller gets invoked. It is consumed in Rack::MethodOverride#call, and that is what we cannot understand.
Also, most of the time (or even always; we are not sure), this happens on POST requests sent by mobile devices. Could this have something to do with a slow connection? (The POST payload is very tiny, though.)
Has anyone experienced this? Any advice on how to keep exploring this issue is appreciated.
Thanks in advance!
Since the Ruby agent began to instrument middleware in version 3.9.0.229, we've seen this question arise for some users. One possible cause of the longer timings is that Rack::MethodOverride needs to examine the request body on POST in order to determine whether the POST parameters contain a method override. It calls Rack::Request#POST, which ends up triggering a read of the entire request body.
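For context, the middleware's logic looks roughly like this (a simplified paraphrase of Rack's source; the exact code varies by Rack version):

    module Rack
      class MethodOverride
        HTTP_METHODS = %w[GET HEAD PUT POST DELETE OPTIONS PATCH]

        def initialize(app)
          @app = app
        end

        def call(env)
          if env["REQUEST_METHOD"] == "POST"
            method = method_override(env) # may block while the body trickles in
            env["REQUEST_METHOD"] = method if HTTP_METHODS.include?(method)
          end
          @app.call(env)
        end

        def method_override(env)
          req = Request.new(env)
          # req.POST parses the form parameters, which forces the entire
          # request body to be read from the socket.
          method = req.POST["_method"] || env["HTTP_X_HTTP_METHOD_OVERRIDE"]
          method.to_s.upcase
        end
      end
    end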
This may be why you see that more time than expected is being spent in this middleware. Looking more deeply into how the POST body relates to the time spent in the middleware might be a fruitful avenue for investigation.
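One way to investigate is to bracket the middleware with timing probes. A minimal sketch, where MiddlewareTimer is a hypothetical class name and the 500ms threshold is illustrative:

    class MiddlewareTimer
      def initialize(app, label)
        @app = app
        @label = label
      end

      def call(env)
        start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
        response = @app.call(env)
        ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round
        if ms > 500
          Rails.logger.warn("#{@label}: #{ms}ms #{env['REQUEST_METHOD']} #{env['PATH_INFO']}")
        end
        response
      end
    end

    # config/application.rb -- the difference between the two probes' timings
    # is the time spent in Rack::MethodOverride itself, including the body read.
    config.middleware.insert_before Rack::MethodOverride, MiddlewareTimer, "incl. MethodOverride"
    config.middleware.insert_after  Rack::MethodOverride, MiddlewareTimer, "excl. MethodOverride"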
In case anyone is experiencing this:
We finally made the switch from Unicorn to Passenger, and the issue has been resolved:
https://github.com/phusion/passenger-ruby-heroku-demo
I am not sure, but the problem may have something to do with POST requests on slow clients. Passenger/nginx says:
Request/response buffering - The included Nginx buffers requests and responses, thus protecting your app against slow clients (e.g. mobile devices on mobile networks) and improving performance.
So this may be the reason.

Heroku: request taking 100ms intermittently times out

After performing load testing against an app hosted on Heroku, I am finding that the most DB-intensive request takes 50-200ms depending upon load. It never gets slower, no matter the load. However, seemingly at random, the request will outright time out (30s or more).
On Heroku, why might a relatively high-performing query/request work perfectly 8 times out of 10 and outright time out 2 times out of 10 as load increases?
If this is starting to seem like a question for Heroku itself, I'm looking to first answer the question of whether "bad code" could somehow cause this issue, or whether it is clearly a problem on their end.
A bit more info:
Multiple Dynos
Cedar Stack
Dedicated Heroku DB (16 connections, 1.7 GB RAM, 1 comp. unit)
Rails 3.0.7
Thanks in advance.
Since you have multiple dynos and a dedicated DB instance and are paying hundreds of dollars a month for their service, you should ask Heroku.
Edit: I should have added that when you check your logs, you can look for lines that say "router". Those come from the Heroku routing layer that takes HTTP requests and sends them to your app. You can add those timings up to see how much time is being spent outside your app. Unfortunately, I don't know how easy it is to get large volumes of those logs for a load test.
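For reference, those router lines carry per-request timings. A minimal sketch (assuming the standard Heroku router line format) that pulls the router's view of each request out of a captured log; compare it with your app's own timing to see how much time is spent outside the app:

    # Router lines look like:
    #   heroku[router]: at=info method=POST path="/orders" ... connect=2ms service=1042ms status=200 ...
    slow = []
    File.foreach("heroku.log") do |line|
      next unless line.include?("heroku[router]")
      connect = line[/connect=(\d+)ms/, 1].to_i
      service = line[/service=(\d+)ms/, 1].to_i
      path    = line[/path="([^"]+)"/, 1]
      slow << [connect + service, path] if connect + service > 1000
    end
    slow.sort_by { |ms, _| -ms }.first(20).each { |ms, path| puts "#{ms}ms  #{path}" }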

Diagnosing Rails 3 Heroku Slowness

I have a Rails 3 app that I am running on Heroku. The app is usually really fast but sometimes I'll get cases where the app seems to hang for upwards of 2 minutes before finally returning the requested page.
I have the New Relic add-on installed, and nothing there sticks out at me. The slowness is sporadic and doesn't seem to be connected to a particular controller/action.
How would you suggest I go about pinpointing the cause of this problem?
http://github.com/kyledecot/skateparks-web
Always check the logs. When it happens, immediately go check your logs. All SQL queries are logged and timed (at the :debug log level), and you might want to add logging and timing to some of your own service calls.
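For the "timing your own service calls" part, ActiveSupport::Notifications (available since Rails 3) is the idiomatic hook. A minimal sketch; the "service.call" event name and the Geocoder call are stand-ins for your own code:

    # One subscriber logs every "service.call" event.
    ActiveSupport::Notifications.subscribe("service.call") do |_name, start, finish, _id, payload|
      Rails.logger.info("#{payload[:service]} took #{((finish - start) * 1000).round}ms")
    end

    # Wrap any call you suspect:
    ActiveSupport::Notifications.instrument("service.call", service: "geocoder") do
      Geocoder.search(address) # stand-in for your own slow service call
    end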
If you upgrade to the Pro level of New Relic, you can get detailed traces specifically of your slow transactions. Turn up your Transaction Trace threshold to a large number (1s is pretty big), and wait for traces to show up. You'll see a detailed breakdown of the performance of an individual request, including SQL queries.
(Full disclosure: I work for New Relic.)
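For reference, that Transaction Trace threshold lives under transaction_tracer in newrelic.yml; a minimal sketch with illustrative values:

    # newrelic.yml -- values are illustrative
    production:
      transaction_tracer:
        enabled: true
        transaction_threshold: 1.0  # seconds; record a detailed trace for anything slower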
