Heroku add-ons 'Logentries' & 'FlyData' query - ruby-on-rails

I have a Ruby on Rails web app hosted on Heroku, and I've set up the Logentries add-on, which configures an alert for 'High Response Time'.
Lately, I have started getting 'ALERT High Response Time' emails, which say the alert was triggered by
heroku router - - at=info method=GET path="/robots.txt"
Now, I know that search engines like Google and Microsoft use robots.txt to work out which pages they should not crawl or index. Is there any other reason why this file would be accessed?
Please correct me if I am missing something here.
Oh, and I am using the free tier of Heroku, i.e. 1 web dyno serving the site content, plus 1 worker dyno that runs periodic jobs via the Scheduler.
Query #2-
What's wrong with my application when I get the following email from Logentries, with subject - 'ALERT Exit Timeout'?
Exit timeout: Heroku/my-app
2014-10-13 18:53:56.351
188 <45>1 2014-10-13T18:53:56.053533+00:00 heroku web.1 - - Error R12 (Exit timeout) -> At least one process failed to exit within 10 seconds of SIGTERM
Query #3-
I also installed the FlyData add-on trial to see how it works. I get emails with the subject - '[FlyData-Alert] (myapp) Application Error notification'.
The email says-
We noticed the following error logs on your application (myapp) :
2014-10-08T23:59:53.042662+00:00 app[scheduler.3266]: ** [NewRelic][10/08/14 23:59:53 +0000 21fd815f-5e08-42ab-80d8-4771ea1593c7 (2)] INFO : Installing Rails3 Error instrumentation
I think this email is triggered because of the INFO message from New Relic, which says - Installing Rails3 Error instrumentation. The FlyData add-on probably looks at the keyword 'Error' and triggers the email alert.

For Query #2: Heroku - Exit timeout: Heroku/my-app
According to Heroku's documentation,
"A process failed to exit within 10 seconds of being sent a SIGTERM indicating that it should stop. The process is sent SIGKILL to force an exit."
A complete list of Heroku error codes, including this one, can be found here: https://devcenter.heroku.com/articles/error-codes#r12-exit-timeout
If you're using WEBrick to run your application on Heroku, try switching to 'thin' to see if that helps: https://devcenter.heroku.com/articles/rails3#webserver.
Or see this previous answer on Stack Overflow:
Rails app hosted on heroku: Error R12 (Exit timeout)
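For reference, the WEBrick-to-thin switch on a Rails 3 app is just a Gemfile and Procfile change; a minimal sketch (these are the standard entries from Heroku's Rails 3 guide, adjust to your app):
# Gemfile
gem 'thin'

# Procfile
web: bundle exec thin start -p $PORT -e $RACK_ENV
After bundle install and a redeploy, the web dyno boots thin instead of WEBrick.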
Hope this helps.
Michael

Related

Spree - Timeout when trying to access backend

Since this afternoon, access to the backend of my Spree shop has suddenly become unavailable. When I try to visit any page in the backend (/admin/users, /admin/orders, etc.), the page just loads for a long time until it times out and I get the generic error page.
When I look into the logs I always see either:
Processing by Spree::Admin::OrdersController#index as HTML
Completed 500 Internal Server Error in 127259ms
** [Airbrake] Success: Net::HTTPOK
Errno::ETIMEDOUT (Connection timed out - connect(2)):
app/middleware/flash_session_cookie_middleware.rb:18:in `call'
or
Processing by Spree::Admin::OrdersController#index as HTML
Completed 500 Internal Server Error in 127520ms
** [Airbrake] Success: Net::HTTPOK
SocketError (getaddrinfo: Name or service not known):
app/middleware/flash_session_cookie_middleware.rb:17:in `call'
This started happening after the last deployment to production, which only changed images and stylesheets. I can't reproduce the error locally, despite having the same code and an exact copy of the production database.
I'm using Spree version 2.0.3
Run Spree::Config[:check_for_spree_alerts] = false in your console to fix this. You may also want to add this line to your initializers/spree.rb to ensure check_for_spree_alerts is not re-enabled in the future.
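For example, a minimal config/initializers/spree.rb along these lines should do it (a sketch; merge it into your existing initializer if you already have one):
# config/initializers/spree.rb
Spree.config do |config|
  # Stop the backend from polling the discontinued Spree Alerts service on login
  config.check_for_spree_alerts = false
end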
This is happening because the Spree Alerts website has been discontinued. See: https://github.com/spree/spree/pull/6516
To be specific, this is happening because, when logging into the backend, Spree 2.0.x checks for alerts from the Spree website here https://github.com/spree/spree/blob/2-0-stable/backend/app/controllers/spree/admin/base_controller.rb#L39 which then calls alert.rb:14:
HTTParty.get('http://alerts.spreecommerce.com/alerts.json', query: params).parsed_response
At the moment, alerts.spreecommerce.com has been discontinued and is timing out, which explains the errors you're receiving.
The Spree Alerts code has been removed as of Spree 2.3, so you could also upgrade to that version to resolve this issue.

Heroku Error H13

I've been getting this error on and off for the past couple of days, since I deployed my application to Heroku. It happened both before and after I started using Unicorn as the server. I can sometimes get the app back up by running heroku run rake db:migrate and then heroku restart, but that only fixes it for a couple of hours before it breaks again. The webpage itself just says "Application error". The logs aren't very helpful, but here's what they say each time this error happens:
[2014-10-27T21:13:31.675956 #2] ERROR -- : worker=1 PID:8 timeout (16s > 15s), killing
[2014-10-27T21:13:31.731646 #14] INFO -- : worker=1 ready
[2014-10-27T21:13:31.694690 #2] ERROR -- : reaped #<Process::Status: pid 8 SIGKILL (signal 9)> worker=1
at=error code=H13 desc="Connection closed without response" method=GET
I'm just using the free tier of Heroku; I want to make sure it works before upgrading, but is upgrading my only option at this point?
Also, I am able to run this locally just fine using either rails server or foreman start.
Heroku docs say this about H13:
H13 - Connection closed without response
This error is thrown when a process in your web dyno accepts a connection, but then closes the socket without writing anything to it.
One example where this might happen is when a Unicorn web server is configured with a timeout shorter than 30s and a request has not been processed by a worker before the timeout happens. In this case, Unicorn closes the connection before any data is written, resulting in an H13.
A couple lines up, you have an error about a process timing out after 15s:
ERROR -- : worker=1 PID:8 timeout (16s > 15s), killing
Heroku help has a section on timeout settings:
Depending on your language you may be able to set a timeout at the app server level. One example is Ruby's Unicorn. In Unicorn you can set a timeout in config/unicorn.rb like this:
timeout 15
The timer will begin once Unicorn starts processing the request; if 15 seconds pass, then the master process will send a SIGKILL to the worker, but no exception will be raised.
That matches the error messages in your log. I'd look into it.
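For reference, that timeout lives in config/unicorn.rb; a minimal sketch along the lines of Heroku's Unicorn guide (the worker count and timeout values here are illustrative, not a recommendation):
# config/unicorn.rb
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 15        # any worker still busy after 15s is killed with SIGKILL
preload_app true
Raising the timeout only buys time (Heroku's own router gives up on a request after 30 seconds anyway), so the real fix is working out why that request takes longer than 15 seconds in the first place.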

How to find out what causes Unicorn workers to time out

People keep reporting that my website hangs on certain pages. I checked the Unicorn stderr log and found many timeout errors like:
E, [2013-08-14T09:27:32.236478 #30027] ERROR -- : worker=5 PID:11619 timeout (601s > 600s), killing
E, [2013-08-14T09:27:32.252252 #30027] ERROR -- : reaped #<Process::Status: pid=11619,signaled(SIGKILL=9)> worker=5
I, [2013-08-14T09:27:32.266141 #4720] INFO -- : worker=5 ready
There are many error messages like that.
Then I went to the Rails production log and found the exact requests by searching around the Unicorn error time minus 601s. These timed-out requests all choked at the page rendering phase; the SQL for these requests had already completed. The rendering just never finishes:
Processing by XXXController#index as HTML
Rendered xxx/index.html.erb within layouts/application (41.4ms)
Rendered shared/_sidebar.html.erb (200.9ms)
There is no 'Completed' line. Most of these requests are normally served successfully; I don't know why, at random times, they hang there.
I have no idea what might cause this. Can anybody give me a clue about how to find the real reason the Unicorn workers time out?
Update:
We use NSC to forward requests and responses to Unicorn. To try to improve the timeout issue, we added nginx between NSC and Unicorn. It turns out the Unicorn worker timeouts still happen, and each one matches an nginx upstream timeout in the nginx error log.
Does anyone know whether there is some kind of bottleneck in Unicorn's TCP connections?
I'm using Rack::Timeout to time out before Unicorn does. Unicorn's timeout uses kill -9 (SIGKILL), which doesn't give the worker any chance to do anything; Rack::Timeout instead raises an exception inside the worker, so the Rails log gets a backtrace showing where the request was stuck.
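If you want to try that, a minimal setup looks roughly like this (a sketch: the class-level setter is how rack-timeout was configured in versions current at the time of this question; newer versions are configured differently, so check the gem's README):
# Gemfile
gem 'rack-timeout'

# config/initializers/rack_timeout.rb
# Raise an exception inside the worker well before Unicorn's 600s timeout,
# so the backtrace shows where the request was stuck.
Rack::Timeout.timeout = 60   # seconds; illustrative, keep it below Unicorn's timeout
The resulting timeout exceptions in the Rails log (or your error tracker) point at whatever the worker was doing when it stalled.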

PG::Error EOF detected on Heroku Cedar, rails 3.2.11

Having experienced a few periods of downtime, we've recently upgraded to a production environment on Heroku (Crane database plus 2 web dynos); however, we've seen no improvement. In fact, reliability seems to have decreased since upgrading.
The root cause seems to be the following exception:
PG::Error (SSL SYSCALL error: EOF detected
which causes the dyno to fail and - eventually - restart, but not before causing some downtime.
I've no idea what's causing it. Common culprits appear to be Resque and Unicorn, neither of which I'm using. We're on rails 3.2.11, on Heroku Cedar, using pg gem 1.14.1
Logs report the following at crash time:
2013-05-23T19:01:33+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_PINK measure.current_transaction=34490 measure.db_size=38311032bytes measure.tables=19 measure.active-connections=7 measure.waiting-connections=0 measure.index-cache-hit-rate=0.99438 measure.table-cache-hit-rate=0.8824
2013-05-23T19:01:35.123633+00:00 app[web.2]:
2013-05-23T19:01:35.123633+00:00 app[web.2]: PG::Error (SSL SYSCALL error: EOF detected
2013-05-23T19:01:35.123633+00:00 app[web.2]: ):
I have read the following: https://groups.google.com/forum/?fromgroups#!topic/heroku/a6iviwAFgdY but can't find anything that might help.
https://gist.github.com/ktopping/5657474
The above fixes the exception, which is useful (it should declutter my logs and even help speed up reconnecting to the database), but it doesn't actually stop my main issue, which is Heroku web dynos crashing more often than I would like.
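The gist itself isn't reproduced here, but the usual shape of that workaround is to catch the dropped-connection error at the adapter level, reconnect, and retry the statement once. A rough sketch for Rails 3.2-era ActiveRecord (this is not the gist's exact code; the file name and the message check are illustrative):
# config/initializers/pg_connection_fix.rb -- illustrative sketch only
require "active_record/connection_adapters/postgresql_adapter"  # make sure the adapter class is loaded before patching

module ActiveRecord::ConnectionAdapters
  class PostgreSQLAdapter
    alias_method :execute_without_retry, :execute

    # Retry a statement once if the server dropped the connection underneath us.
    def execute(sql, name = nil)
      execute_without_retry(sql, name)
    rescue ActiveRecord::StatementInvalid, PG::Error => e
      raise unless e.message =~ /EOF detected/
      reconnect!
      execute_without_retry(sql, name)
    end
  end
end
As noted above, this only tidies up the symptom; it doesn't explain why the connections are being dropped in the first place.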
Am investigating some other routes (Unicorn, rack-timeout).

Rails oink with Heroku

This question has been asked before but no answer seems to work for me. I will break the problem down into its 3 components:
1) I occasionally receive a Heroku R14 (memory quota exceeded) error (i.e. the site has been up 2 days on Heroku and I got this error twice, for a period of about 10-15 minutes [I was too emotional to time it precisely]).
2) I installed the oink gem as advised by Heroku.
3) Oink definitely logs, as I get messages to that effect in heroku logs and in Webrick when I work locally. However, I am unable to access the logging summary that shows which functions exceed a memory threshold.
The only command that returns a result (but a wrong one) is:
oink --threshold=0 logfile_for_oink
But it returns an empty summary, as follows:
---- MEMORY THRESHOLD ----
THRESHOLD: 0 MB
-- SUMMARY --
Worst Requests:
Worst Actions:
Aggregated Totals:
Every other attempt - often copying advice already on StackOverflow - returns errors.
I will list below the different attempts I have made (so that no one suggests something I may have already tried).
heroku run bundle exec oink --threshold=75 log/*
This line returns the following error:
/app/vendor/bundle/ruby/1.9.1/gems/oink-0.10.1/lib/oink/cli.rb:88:in `block in get_file_listing': Could not find "log/development.log" (RuntimeError)
Every variation on this, such as log/production.log or /log/* or what have you, has failed.
I also tried the advice on the following links to no avail:
Using oink gem with heroku
oink logs command not working on heroku
How can I run oink in heroku?
Can anyone help me?
Heroku prepends each log line with a timestamp and source (as in the samples above), so oink can't parse it. You can strip that prefix with a regex, though.
http://arches.io/2013/07/understand-memory-usage-on-heroku-rails-app-using-oink/
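Something along these lines should do it (a sketch: the regex assumes the usual 'timestamp source[dyno]:' prefix visible in the log samples above, and the file names are just examples). First capture some logs, e.g. heroku logs -n 1500 > heroku.log, then strip the prefix:
# strip_heroku_prefix.rb -- remove Heroku's "timestamp source[dyno]: " prefix
# so oink can parse the underlying Rails log lines
File.open("rails.log", "w") do |out|
  File.foreach("heroku.log") do |line|
    out.puts line.sub(/\A\S+ \S+\[[^\]]+\]: /, "")
  end
end
Running oink --threshold=75 rails.log against the cleaned file should then give a populated summary.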
