Rails 5 - Heroku Puma - random timeout? - ruby-on-rails

I've recently switched from Unicorn to Puma.
As expected, performance is better!
However, something is happening that I don't understand.
Most of the time, my API handles things well at the 95th percentile and below.
But from time to time, for no apparent reason, I get strange timeouts from Heroku at the 99th percentile (over 30 seconds of waiting time).
In the screenshot, the big rectangles are timeouts at the 99th percentile...
So I checked my log reader, and random routes are randomly timing out, so this is not a question of API performance.
Also, I have the rack-timeout gem with the following configuration:
# config/initializers/rack_timeout.rb
Rack::Timeout.service_timeout = 25 # seconds
but the timeout is never raised there; I only get H12 errors on Heroku (so after 30 seconds).
This has been happening from the moment I started using Puma.
Any ideas?
API configuration:
ruby 2.5.1
rails 5.1.4
Puma configuration:
gem version 3.11.4
3 workers
5 threads
preload_app!
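For reference, a minimal config/puma.rb sketch matching the setup above (the worker and thread counts are the ones stated in the question; the on_worker_boot reconnect is a common companion to preload_app!, shown here as an assumption, not something the question confirms):
# config/puma.rb -- sketch only; values mirror the question
workers 3
threads 5, 5
preload_app!

# With preload_app!, forked workers inherit the master's state, so a
# common precaution is to re-establish database connections per worker.
on_worker_boot do
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end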

Related

Pry session timing out because of Puma worker timeout

Whenever I hit a binding.pry while running an app locally, I enter the pry session as normal, but after about a minute, I see something like this in my server output.
[54438] ! Terminating timed out worker: 54455
Then the server seems to run in a loop for a second or two (re-running the queries that led to the pry session) and I return to a new pry session from the same binding.pry, except that in this new pry session I can't see anything I'm typing. The only way to fix this is to quit the server and restart.
I've tried inserting the following line in my config/puma.rb file but it doesn't seem to make any difference.
worker_timeout 900 if ENV["RACK_ENV"] == "development"
The only thing that works is setting my number of Puma workers to 0 in my .env file, e.g.
PUMA_WORKERS=0
Is there any way around this issue that doesn't involve just eliminating all puma workers?
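One way to wire that workaround into the config rather than hard-coding it (a sketch; PUMA_WORKERS is the variable from the question, and the default counts are assumptions):
# config/puma.rb
# With 0 workers, Puma runs in single mode, so there is no worker
# watchdog to terminate a process paused inside a pry session.
default_workers = ENV["RACK_ENV"] == "development" ? 0 : 3
workers Integer(ENV.fetch("PUMA_WORKERS", default_workers))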

Sidekiq, Redis, Rails and Ruby - no implicit conversion of Pathname into String

I am trying to get my sidekiq server and client up and running (using Foreman), but whenever it gets to:
bundle exec sidekiq
The following results:
no implicit conversion of Pathname into String
Just like that, without "TypeError" preceding it; obviously a stack trace followed (I will post it if it helps). It says the problem is in active_support/dependencies.rb (version 5.0.0.1), in the require method. Earlier, the stack trace goes through boot_system in Sidekiq's cli.rb (version 4.1.2). I am not sure whether this is a known issue with Sidekiq or whether I am missing some configuration (I have read through a good number of tutorials, including thorough discussions of Sidekiq, Puma and Redis configuration, but to no avail). I am running Ruby 2.3.1 and Rails 5.0.0.1.
The sidekiq.yml file includes (I got the error before this file and including it did not solve the issue):
development:
  :concurrency: 5
production:
  :concurrency: 20
:queues:
  - default
Also, I am really new to posting on Stack Overflow (but have made frequent use of it in the past). Any guidance would be great!
So I did manage to get Sidekiq up and running with Redis. My problem was with one of the worker scripts, which had an error in it... Sidekiq was picking it up from a directory other than app/workers (I had placed it in app/temp while debugging). I only saw it now in the stack trace; obviously I missed it earlier from staring at a screen too long (the classics). It is still weird that the error message was missing "TypeError", though.
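For anyone landing here, a minimal worker of the kind being booted (a generic Sidekiq 4.x sketch, not the asker's actual file):
# app/workers/hard_worker.rb
class HardWorker
  include Sidekiq::Worker

  def perform(name)
    # do something with name
  end
end
Note that Rails eager-loads every directory under app/ in production, which is why a broken script in app/temp breaks boot just like one in app/workers would.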

Override 30 seconds timeout on gem class timeout

My Thin server is timing out after 30 seconds. I would like to override DEFAULT_TIMEOUT in this Ruby file, changing it from 30 seconds to 120 seconds. How do I do that? Please let me know.
The code is here:
https://github.com/macournoyer/thin/blob/master/lib/thin/server.rb
I would like to override it without "already initialized constant" warnings.
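For the literal constant question: Ruby only warns when a constant is reassigned while still defined, so one sketch is to remove it first (assuming the constant lives at Thin::Server::DEFAULT_TIMEOUT, as in the linked file, and that Thin is already loaded):
Thin::Server.send(:remove_const, :DEFAULT_TIMEOUT)  # remove_const is private, hence send
Thin::Server.const_set(:DEFAULT_TIMEOUT, 120)
That said, the answer below avoids monkey-patching entirely.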
See the help:
➜ ~/app ✓ thin --help | grep timeout
-t, --timeout SEC Request or command timeout in sec (default: 30)
So you can change it from the command line when starting the server:
➜ ~/app ✓ thin --timeout 60 start
or you can set up a config file somewhere like /etc/thin/your_app.yml with something like this:
---
timeout: 60
and then run Thin, pointing it at this YAML file with:
thin -C /etc/thin/your_app.yml start
As a side note, consider whether increasing your timeout is really necessary. Typically, long-running requests should be queued up and run later through a service like delayed_job or resque.
After seeing your comment and learning you're using Heroku, I suggest you read the documentation:
Occasionally a web request may hang or take an excessive amount of time to process by your application. When this happens the router will terminate the request if it takes longer than 30 seconds to complete. The timeout countdown begins when the request leaves the router. The request must then be processed in the dyno by your application, and then a response delivered back to the router within 30 seconds to avoid the timeout.
I even more strongly suggest looking into delayed_job, resque, or similar if you're using Heroku. You will need at least one worker running to handle the queue. HireFire is an excellent service that saves you money by only spinning up workers when your queue actually has jobs to process.
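To make the queue-it-later suggestion concrete, a sketch using delayed_job's object#delay API (the controller and ReportGenerator class are made up for illustration):
# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
  def create
    # Enqueue the slow work instead of doing it in-request; the response
    # returns immediately, well under Heroku's 30-second router limit.
    ReportGenerator.new(params[:report_id]).delay.generate
    head :accepted
  end
end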

PG::Error EOF detected on Heroku Cedar, rails 3.2.11

Having experienced a few periods of downtime, we recently upgraded to a production environment on Heroku (a Crane database plus 2 x web dynos); however, we've seen no improvement. In fact, reliability seems to have decreased since upgrading.
The root cause seems to be the following exception:
PG::Error (SSL SYSCALL error: EOF detected
which causes the dyno to fail and - eventually - restart, but not before causing some downtime.
I've no idea what's causing it. Common culprits appear to be Resque and Unicorn, neither of which I'm using. We're on Rails 3.2.11, on Heroku Cedar, using pg gem 0.14.1.
Logs report the following at crash time:
2013-05-23T19:01:33+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_PINK measure.current_transaction=34490 measure.db_size=38311032bytes measure.tables=19 measure.active-connections=7 measure.waiting-connections=0 measure.index-cache-hit-rate=0.99438 measure.table-cache-hit-rate=0.8824
2013-05-23T19:01:35.123633+00:00 app[web.2]:
2013-05-23T19:01:35.123633+00:00 app[web.2]: PG::Error (SSL SYSCALL error: EOF detected
2013-05-23T19:01:35.123633+00:00 app[web.2]: ):
I have read the following: https://groups.google.com/forum/?fromgroups#!topic/heroku/a6iviwAFgdY but can't find anything that might help.
https://gist.github.com/ktopping/5657474
The above fixes the exception, which is useful (it should declutter my logs, and even help speed up reconnecting to the database), but it doesn't actually stop my main issue, which is Heroku web dynos crashing more often than I would like.
I'm investigating some other routes (Unicorn, rack-timeout).
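A pattern often suggested for this class of error, relevant if Unicorn is adopted: disconnect before forking and reconnect in each worker, so processes never share a Postgres socket (these are Unicorn's standard hooks; the file path is the conventional one):
# config/unicorn.rb
before_fork do |server, worker|
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end

after_fork do |server, worker|
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end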

Rufus-scheduler only running once on production

I'm using rufus-scheduler to run a process every day from a Rails server. For testing purposes, let's say every 5 minutes. My code looks like this, in config/initializers/task_scheduler.rb:
scheduler = Rufus::Scheduler::PlainScheduler.start_new
scheduler.every "10m", :first_in => '30s' do
  # Do stuff
end
I've also tried the cron format:
scheduler.cron '50 * * * *' do
  # stuff
end
for example, to get the process to run every hour at 50 minutes after the hour.
The infuriating part is that it works on my local machine: the process runs regularly and just works. It's only in my deployed production app that the process runs once and does not repeat.
ps faux reveals that cron is running, Passenger is handling the spin-up of the Rails process, the site has been pinged again so it knows it should refresh, and production shows the changes in the code. The only thing that's different is that, without a warning or error, the scheduled task doesn't repeat.
Help!
You probably shouldn't run rufus-scheduler in the Rails server itself, especially not with a multi-process framework like Passenger. Instead, you should run it in a daemon process.
My theory on what's happening:
Passenger starts up a Ruby server process and uses it to fork off other server processes to handle requests. But since rufus-scheduler runs its jobs in a separate thread from the main thread, the rufus thread is only alive in the original Ruby process (Ruby's fork only duplicates the thread that does the forking). This might seem like a good thing, because it prevents multiple schedulers from running, but... Passenger may kill Ruby processes under certain conditions, and if it kills the original, the scheduler thread is gone.
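A sketch of the daemon approach (using the rufus-scheduler 3.x API; the file name and launch command are illustrative):
# lib/scheduler_daemon.rb -- run as its own long-lived process, e.g.:
#   bundle exec rails runner lib/scheduler_daemon.rb
require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

scheduler.cron '50 * * * *' do
  # Do stuff
end

scheduler.join  # block so the process (and its job threads) stays alive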
Add the lines below to your Apache config /etc/apache2/apache2.conf and restart your Apache server:
RailsAppSpawnerIdleTime 0
PassengerMinInstances 1
Kelvin is right.
Passenger kills 'unnecessary' threads.
http://groups.google.com/group/rufus-ruby/search?group=rufus-ruby&q=passenger
