unexplained H18 errors in Rails App on Heroku - ruby-on-rails

I have a very strange problem with my Rails app on Heroku that I haven't been able to solve for nearly a month now. It only occurs on the production server, I can't replicate it in development, and the logs report no errors except H18 errors.
Here's what is happening. The application runs fine for about 12 hours, then at a certain point the number of requests spikes a bit and Heroku starts reporting regular H18 errors.
At this point the application doesn't completely fail, but every request that invokes some kind of open-uri request (basically a request to an external web service) fails and returns a 500 error. Normal requests that simply display a static view still seem to work.
The logs are not particularly helpful.
Nearly every H18 error is associated with a request for /robots.txt but at least one error is associated with an assets request: "/assets/application-38a4580edd72e30f34ea76583ab7e1b1b5654c72a6313ece935177d23b0398d3.css"
Below is an excerpt:
Oct 10 21:09:32 lombardpress-web heroku/router: sock=backend at=error code=H18 desc="Server Request Interrupted" method=GET path="/robots.txt" host=scta.lombardpress.org request_id=e0e344d1-0349-4b0a-8db0-3c6e3ad3e99f fwd="157.55.39.188" dyno=web.1 connect=0ms service=209ms status=503 bytes= protocol=http
Oct 10 21:09:41 lombardpress-web heroku/router: sock=backend at=error code=H18 desc="Server Request Interrupted" method=GET path="/assets/application-38a4580edd72e30f34ea76583ab7e1b1b5654c72a6313ece935177d23b0398d3.css" host=scta.lombardpress.org request_id=fdd88ca2-140e-4051-9011-4d81ca218f19 fwd="157.55.39.206" dyno=web.1 connect=0ms service=252ms status=503 bytes= protocol=http
Oct 10 21:33:55 lombardpress-web heroku/router: sock=backend at=error code=H18 desc="Server Request Interrupted" method=GET path="/robots.txt" host=scta.lombardpress.org request_id=5aa463f5-43ff-4b74-a2de-e944aa9d2387 fwd="46.229.168.147" dyno=web.1 connect=0ms service=254ms status=503 bytes= protocol=http
Oct 10 21:38:23 lombardpress-web heroku/router: sock=backend at=error code=H18 desc="Server Request Interrupted" method=GET path="/robots.txt" host=scta.lombardpress.org request_id=2300949d-8998-4ed0-a4ab-bf1975e93cc6 fwd="216.244.66.197" dyno=web.1 connect=0ms service=220ms status=503 bytes= protocol=http
Once the errors start, I simply have to restart the app and everything works fine again for another 12 or so hours, even requests to robots.txt. But after approximately 12 hours the same problem occurs.
Due to web crawlers, I would estimate the application is getting hit with a request every 2-4 seconds. Otherwise it is not a particularly high-traffic site.
Heroku provided only minimal feedback:
The backend socket, belonging to your app’s web process was closed before the backend returned an HTTP response. This is happening at the application or web server level so there is not much additional insight we can provide. You can see that the request spends some time in your application service=1118ms before returning a status 500. I wonder if a middleware (maybe you have one making external requests?) is failing before it hits your actual rails stack. I would suggest starting by looking there.
The code is open source and available here if anyone is interested in poking around: https://github.com/lombardpress/lombardpress-web
I'd be grateful for any thoughts or suggestions. I've been struggling with this for a while now and I'm not sure how to solve the problem.
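Since the failing requests are the ones that go through open-uri, one thing worth trying is to give every external call explicit timeouts and to log the exception class when it fails, so the cause shows up in the app logs rather than only as a router error. The following is a minimal sketch, not code from the lombardpress-web repository: fetch_external and its opener parameter are hypothetical names, and the timeout values are arbitrary assumptions.

```ruby
require "open-uri"
require "net/http" # ensures Net::OpenTimeout and Net::ReadTimeout are loaded

# Hypothetical helper: fetch an external resource with bounded timeouts.
# `opener` is injectable so the failure path can be exercised without a network.
# Returns the response body as a String, or nil if the request failed.
def fetch_external(url, open_timeout: 5, read_timeout: 10, opener: URI.method(:open))
  opener.call(url, open_timeout: open_timeout, read_timeout: read_timeout) { |io| io.read }
rescue OpenURI::HTTPError, Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError => e
  # Log the exception class so the failure is visible in the application logs.
  warn "external request to #{url} failed: #{e.class}: #{e.message}"
  nil
end
```

A caller can then decide what to render when nil comes back instead of letting the request die with a 500.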

Related

Exception catch for Heroku "Connection closed without response"

I am using Heroku to run my application with Unicorn, and I have set the timeout to 12s (using rack-timeout). Sometimes I get the H13 error:
2010-10-06T21:51:37-07:00 heroku[router]: at=error code=H13 desc="Connection closed without response"
method=GET path="/" host=myapp.herokuapp.com fwd=17.17.17.17 dyno=web.1 connect=12610ms service=15882ms
status=503 bytes=0
So the request is cut off partway through code execution. Can we handle this as an exception somehow? Am I doing anything wrong here?
Set the timeout to 30 seconds and check whether the problem recurs. This should help.
The documentation says:
When a Unicorn web server is configured with a timeout shorter than 30s and a request has not been processed by a worker before the timeout happens.
In this case, Unicorn closes the connection before any data is written, resulting in an H13.
https://devcenter.heroku.com/articles/error-codes#h13-connection-closed-without-response
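On the question of handling this as an exception: the H13 is logged by the Heroku router rather than raised inside the app, so it cannot be rescued directly. What an app can do is enforce its own deadline and turn a timeout into a normal response before the connection is dropped. Below is a minimal sketch of that idea using only Ruby's standard Timeout module. TimeoutGuard is a hypothetical name, this is not the API of rack-timeout or Unicorn, and Timeout.timeout can interrupt code at unsafe points, so treat it as an illustration rather than a drop-in fix.

```ruby
require "timeout"

# Hypothetical Rack-style middleware: aborts the wrapped app after `seconds`
# and answers with a 503 instead of letting the connection close silently.
class TimeoutGuard
  def initialize(app, seconds:)
    @app = app
    @seconds = seconds
  end

  def call(env)
    Timeout.timeout(@seconds) { @app.call(env) }
  rescue Timeout::Error
    # The timeout surfaces here as an ordinary exception that can be logged.
    [503, { "content-type" => "text/plain" }, ["Request timed out\n"]]
  end
end
```

For this to have any effect, the deadline would need to be shorter than the web server's own worker timeout.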

R15 Issue on Heroku without exceeding memory limit

I have a Ruby on Rails site running on Heroku's Performance-M dynos, with autoscaling set up to 5 dynos.
Recently, we have been receiving abrupt R15 and H12 errors on the site. During this, memory usage is shown well under the memory quota allowed for the dyno.
Here are the errors shown in the log:
2019-09-16T10:12:08.523336+00:00 app[scheduler.2787]: Command :: identify -format '%wx%h,%[exif:orientation]' '/tmp/897302823996a945884a1d912c28d59520190916-4-1bn5w9k.jpg[0]' 2>/dev/null
2019-09-16T10:12:16.022212+00:00 heroku[scheduler.2787]: Process running mem=1022M(199.7%)
2019-09-16T10:12:16.022295+00:00 heroku[scheduler.2787]: Error R14 (Memory quota exceeded)
2019-09-16T10:12:16.365725+00:00 heroku[router]: at=info method=GET path="/favicon-16x16.png" host=www.site.com request_id=8755a947-ace9-471d-a192-a236785505b4 fwd="45.195.5.37" dyno=web.1 connect=1ms service=2ms status=200 bytes=928 protocol=https
2019-09-16T10:12:19.103405+00:00 heroku[scheduler.2787]: Process running mem=1279M(250.0%)
2019-09-16T10:12:19.103405+00:00 heroku[scheduler.2787]: Error R15 (Memory quota vastly exceeded)
2019-09-16T10:12:19.103405+00:00 heroku[scheduler.2787]: Stopping process with SIGKILL
2019-09-16T10:12:19.427029+00:00 heroku[scheduler.2787]: State changed from up to complete
2019-09-16T10:12:19.388039+00:00 heroku[scheduler.2787]: Process exited with status 137
2019-09-16T10:13:07.886016+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=GET path="/favicon.ico" host=www.site.com request_id=c7cea0a2-7345-44c6-926e-3ad5a0eb2066 fwd="45.195.5.37" dyno=web.2 connect=1ms service=30000ms status=503 bytes=0 protocol=https
As you can see, just before the R15 error, Paperclip was processing an image (the identify command in the first log line).
The beginning of the graphs in the following screenshots shows the status of Heroku Metrics for the affected period:
Heroku Metrics Part 1
Heroku Metrics Part 2
Can anyone please help me figure out how the R15 error, which is related to memory leakage, can occur while the metrics show memory well within the limit? Any help on how to stop this from recurring would be appreciated.
Thanks.
Your R15 error occurred on a one-off dyno created by Heroku Scheduler, completely separate from your web dynos. Your request timeouts appear to be unrelated to the memory issues in your scheduled task.
The scheduled task appears to be running on a 1X dyno (mem=1022M(199.7%)). To change this, launch the Heroku Scheduler add-on and change the dyno type.
For your request timeouts, check out Scout or New Relic to find the problematic endpoint and which part of the stack is taking so long.
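The dyno size can be read off the log itself: each "Process running" line reports both the memory in use and that figure as a percentage of the quota, so dividing one by the other gives the quota. A small sketch of that arithmetic follows; implied_quota_mb is a hypothetical helper name, and the regular expression assumes the mem=NNNM(PP.P%) format shown in the log above.

```ruby
# Estimate the memory quota implied by a Heroku "Process running mem=..." line.
# Returns the quota in MB (rounded), or nil if the line has no mem= field.
def implied_quota_mb(line)
  match = line.match(/mem=(\d+)M\((\d+(?:\.\d+)?)%\)/)
  return nil unless match

  used_mb = match[1].to_f
  percent = match[2].to_f
  (used_mb / (percent / 100.0)).round
end
```

Both memory lines in the log above (1022M at 199.7% and 1279M at 250.0%) work out to a quota of about 512 MB, far below what the web dynos are allowed.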

heroku router - - at=error code=H12 - Site going down for 30s every day or two

Heroku Rails site going down every day or two for 30 seconds max, alerts sent by uptime-robot.
I have tried some basic load testing, and it doesn't seem to take the site down under solid traffic. I'm not running any expensive queries on the homepage.
Error logs look like the below. Running Rails on hobby dev.
Not getting any errors through New Relic. Running Puma 'puma', '~> 2.15.3'.
I have this set in the rack-timeout initializer: Rack::Timeout.timeout = 15
Dec 07 20:14:46 sleepy-wave-3748 heroku/router: at=error code=H12 desc="Request timeout" method=GET path="/" host=mySiteUrlHere request_id=274ed877-2edb-43f2-8b77-c7a82f17109a fwd="69.162.124.231,108.162.221.131" dyno=web.1 connect=1ms service=30007ms status=503 bytes=0
Dec 07 20:35:46 sleepy-wave-3748 heroku/router: at=error code=H12 desc="Request timeout" method=HEAD path="/" host=mySiteUrlHere request_id=1b43b249-f089-4a4a-a42b-b2886d607fa8 fwd="69.162.124.231,108.162.221.131" dyno=web.1 connect=0ms service=30003ms status=503 bytes=0
Any suggestions?
Heroku will typically disconnect after 30 seconds and return a 503 error. It appears that your service may be responding successfully but after 30 seconds have passed. This could explain why you are not seeing application errors but are seeing Heroku errors.
EDIT: I think I may have misread your question. Heroku goes through a cycling process on some of its dynos. During this time, there may be a possibility for it to appear that your application has gone down. If you are using free dynos, they require a period of downtime (per rolling 24hr period)
I think I have solved it. I found a pull request in the Puma github account relating to this issue and noticed the fix had been included in a later version of Puma. After upgrading my Puma version, the problem has stopped happening. Hopefully this can help someone else out.
Note: I upgraded to Puma 2.15.3 from 2.11.2.
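When chasing intermittent H12s like these, it can help to pull the router lines apart and look at the service time and path of each one. The sketch below parses the key=value fields of a router line and flags requests that hit the 30-second limit. The helper names are hypothetical, and the parsing assumes the field format shown in the log lines above.

```ruby
# Parse the key=value fields of a Heroku router log line into a Hash.
# Quoted values (e.g. desc="Request timeout") have their quotes stripped.
def parse_router_line(line)
  line.scan(/(\w+)=("[^"]*"|\S*)/).to_h do |key, value|
    [key, value.delete_prefix('"').delete_suffix('"')]
  end
end

# Service time in milliseconds, or nil if the field is missing or empty.
def service_ms(fields)
  raw = fields["service"]
  return nil if raw.nil? || raw.empty?

  raw.delete_suffix("ms").to_i
end

# True when the line is an H12 or the request ran for at least `limit_ms`.
def router_timeout?(line, limit_ms: 30_000)
  fields = parse_router_line(line)
  ms = service_ms(fields)
  fields["code"] == "H12" || (!ms.nil? && ms >= limit_ms)
end
```

Grouping the flagged lines by path and time of day is one way to see whether the timeouts cluster around a particular endpoint or a restart.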

Heroku server - Rails View errors but the code runs fine

I have a snippet of code that takes quite a while to run, about two minutes. Locally, when this code is initiated through a button in my view, the page just appears to be loading until the two minutes are up.
When I deployed this to Heroku, the code was erroring out. I changed my Unicorn timeout to allow the code to finish, and now my logs show the code completing just fine.
However, I am presented with this error in my Heroku app's view:
An error occurred in the application and your page could not be served. Please try again in a few moments.
If you are the application owner, check your logs for details.
How can I get the same behavior on my Heroku app that I have locally? I would prefer that the page just continue to appear as if it were loading until the code is complete.
EDIT
I'm seeing this in my logs:
2014-01-14T17:11:49.388634+00:00 heroku[router]: at=error code=H12 desc="Request timeout"
method=POST path=/fluidsurveys/test_send_file host=glacial-spire-1431.herokuapp.com
fwd="66.195.31.22" dyno=web.1 connect=6ms service=30090ms status=503 bytes=0

What does this error from Heroku mean and how can I fix it?

When I try to open an application I'm working on through Heroku, I get an application error. I went into my Heroku logs and found the following error:
"Error H10 (App crashed) -> GET gentle-samurai-8665.herokuapp.com/ dyno= queue= wait= service= status=503 bytes=
2012-07-09T17:39:09+00:00 heroku[router]: Error H10 (App crashed) -> GET gentle-samurai-8665.herokuapp.com/favicon.ico dyno= queue= wait= service= status=503 bytes="
I'm not sure what the error refers to or how I can go about fixing the problem that causes the error. Any help you can give would be great!
Heroku has a complete list of all its error codes. "A crashed web process or a boot timeout on the web process will present" an H10 error. There should be additional lines in your logs from your application that give more details.
