Heroku memory usage high in Rails app

I have a Ruby on Rails application deployed on Heroku, with one Standard-1X web dyno (512MB) and one Standard-1X worker dyno (512MB). I use Puma as the web server, Redis (via the RedisToGo add-on), and Sidekiq for background jobs.
I regularly check the traffic and memory metrics in the Heroku dashboard, and I'm a little confused, because my app seems to use a lot of memory considering its activity.
Every day it really only gets a few visits from a couple of users plus some web crawler traffic. Despite this low traffic, the web dyno's memory usage is fairly high: on most days it sits at around 256MB all day (the one dip in the memory graph is the daily dyno restart).
The worker dyno, by comparison, averages about 112MB.
Is there an explanation for this relatively high memory usage, or is it simply typical for a deployed application? I've looked at other answers on Stack Overflow, and it doesn't look like a memory leak.
In case it helps, here are my Procfile, Puma config, and Redis initializer.
Procfile
web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq -c 5 -v
config/puma.rb
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 5)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV['PORT'] || 3000
environment ENV['RACK_ENV'] || 'development'
on_worker_boot do
  ActiveRecord::Base.establish_connection
end
config/initializers/redis.rb
uri = ENV["REDISTOGO_URL"] || "redis://localhost:6379/"
REDIS = Redis.new(url: uri)
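For reference, the config above forks 2 Puma workers on the web dyno (WEB_CONCURRENCY defaults to 2), so a rough back-of-the-envelope estimate of the dyno's footprint looks like this. The per-process sizes below are illustrative assumptions, not measurements:
workers = Integer(ENV['WEB_CONCURRENCY'] || 2)   # 2 forked Rails processes, per puma.rb above
master_mb = 50                                   # assumed size of the preloaded Puma master
per_worker_mb = 100                              # assumed size of each Rails worker; varies by app
total_mb = master_mb + workers * per_worker_mb   # => 250, in the ballpark of the observed ~256MB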
Thanks in advance for any tips.

Related

Best puma config for local development

Question
What is the ideal Puma config for optimizing performance in development mode?
Background
Our Rails app runs annoyingly slowly in development mode: the server frequently freezes and has to be terminated manually with kill -9 [process number], and refreshing the app in the browser after a code change can take a long time. We are a JSON API on Rails 6 using Puma, and our Puma config is as follows. We also use ActiveAdmin, which I mention because it is the only part of our app that really uses the asset pipeline, and it is the slowest part of the app to develop on (refreshing the admin app after a code change takes the longest).
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }
threads threads_count, threads_count
# Specifies the `port` that Puma will listen on to receive requests; default is 3000.
#
port ENV.fetch('PORT') { 3000 }
# Specifies the `environment` that Puma will run in.
#
environment ENV.fetch('RAILS_ENV') { 'development' }
We have run puma -w 5 to increase the number of workers, which in theory should increase throughput and the potential number of DB connections (correct me if this is wrong), but it hasn't had a noticeable impact on the app's performance. From what I've read, puma -w 5 is also not a great idea during development, because every worker has to reload the entire app on each code change; it's better to run a single worker locally so the app only reloads once. It's possible that this slowness has nothing to do with our Puma config and is some other issue entirely, but regardless, I would love to hear how best to configure Puma for development mode.
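A single-worker setup like the one described above might look something like this (a sketch, not tested; workers 0 keeps Puma in single mode, so the app only reloads once per code change):
# config/puma.rb - development-oriented sketch
workers 0                                             # single mode: no forked workers
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }
threads 1, threads_count                              # min of 1 lets the thread pool shrink when idle
port ENV.fetch('PORT') { 3000 }
environment ENV.fetch('RAILS_ENV') { 'development' }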
Thanks in advance for the help.

In Puma, how do I calculate DB connections?

I'm trying to figure out how many database connections my app will use.
It's Rails 5 hosted on Heroku.
Here is my Puma config
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 5)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV['PORT'] || 3000
environment ENV['RACK_ENV'] || 'development'
on_worker_boot do
  ActiveRecord::Base.establish_connection
end
And the first part of my DB config:
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV['RAILS_MAX_THREADS'] || 5 %>
The part that seems strange to me is that the number of connections, and also my pool setting in database.yml, are all based on RAILS_MAX_THREADS alone. Shouldn't it be RAILS_MAX_THREADS multiplied by the number of workers (WEB_CONCURRENCY)?
Actually, I found the answer explained well here: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#database-connections
As you add more concurrency to your application, it will need more connections to your database. A good formula for determining the number of connections each application will require is to multiply the RAILS_MAX_THREADS by the WEB_CONCURRENCY. This combination will determine the number of connections each dyno will consume.
Rails maintains its database connection pool, with a new pool created for each worker process. Threads within a worker will operate on the same pool. Make sure there are enough connections inside of your Rails database connection pool so that RAILS_MAX_THREADS number of connections can be used. If you see this error:
ActiveRecord::ConnectionTimeoutError - could not obtain a database connection within 5 seconds
This error is an indication that your Rails connection pool is too low. For an in-depth look at these topics, please read the Dev Center article Concurrency and Database Connections.
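As a worked example with the config quoted above, where WEB_CONCURRENCY defaults to 2 and RAILS_MAX_THREADS to 5:
workers = Integer(ENV['WEB_CONCURRENCY'] || 2)                # 2 worker processes
threads_per_worker = Integer(ENV['RAILS_MAX_THREADS'] || 5)   # 5 threads, and a pool of 5 per process
connections_per_dyno = workers * threads_per_worker           # => 10 connections from one web dyno
Each worker process gets its own pool, so the pool setting in database.yml stays at RAILS_MAX_THREADS; the multiplication only matters when checking the total against your database plan's connection limit.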

Heroku Rails 4 Puma app spawning extra instance

I'm running a basic Rails 4 (Ruby 2.1.4) app on Heroku with the following Puma config:
workers Integer(ENV['PUMA_WORKERS'] || 1)
threads Integer(ENV['MIN_THREADS'] || 6), Integer(ENV['MAX_THREADS'] || 6)
I currently do not have any ENV vars set, so I should be defaulting to 1 worker.
The problem is that, while investigating a potential memory leak, it appears that 2 'instances' of my web.1 dyno are running, at least according to New Relic.
I have heroku labs:enable log-runtime-metrics enabled, and it shows my memory footprint at ~400MB. New Relic shows my footprint at an average of ~200MB across 2 'instances'.
heroku ps shows:
=== web (1X): `bundle exec puma -C config/puma.rb`
web.1: up 2014/10/30 13:49:29 (~ 4h ago)
So why would New Relic think I have 2 instances running? If I do a heroku restart, New Relic will see only 1 instance for a while and then bump up to 2. Is this something Heroku is doing but not reporting to me, or is it a Puma thing, even though workers should be set to 1?
See the Feb 17, 2015 release of New Relic agent 3.10.0.279, which addresses this specific issue with tracking Puma instances. I'm guessing that since your app is running on Heroku, you have preload_app! set in your Puma config, so this should apply.
From the release notes:
Metrics no longer reported from Puma master processes.
When using Puma's cluster mode with the preload_app! configuration directive, the agent will no longer start its reporting thread in the Puma master process. This should result in more accurate instance counts, and more accurate stats on the Ruby VMs page (since the master process will be excluded).
I'm testing the update on a project with a similar issue, and it seems to be reporting more accurately.
It's because Puma always has one master process, from which all of the workers are spawned.
So the instance count comes from the following:
1 (master process) + <N_WORKER>
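One way to see this for yourself is to boot Puma in cluster mode locally and list its processes; this is a hypothetical session, and the exact process titles vary by Puma version:
$ bundle exec puma -C config/puma.rb &
$ ps aux | grep [p]uma
puma 2.9.1 (tcp://0.0.0.0:3000)    <- master process
puma: cluster worker 0: 12345      <- the single worker
With PUMA_WORKERS unset, that is 1 + 1 = 2 processes, which matches the two 'instances' New Relic was reporting.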

Single-dyno setup on Heroku for Rails with both WebSocket and worker queue

At the moment I have a small web application running on Heroku with a single dyno. This dyno runs a Rails app on Unicorn with a single worker queue.
config/unicorn.rb:
worker_processes 1
timeout 180
@resque_pid = nil
before_fork do |server, worker|
  @resque_pid ||= spawn("bundle exec rake jobs:work")
end
I would like to add WebSocket functionality, but from what I have read, Unicorn is not one of the web servers that faye-websocket supports. There is the Rainbows! web server, which is based on Unicorn, but I'm unclear whether I could switch to it and keep my spawn for the queue worker.
I suppose that with more than one dyno, one could just add a dyno to run a Rainbows! server for the WebSocket part, right? That is unfortunately not an option at the moment. Is there a way to get this working with a single dyno for my setup?
If not, what other options are available to push information from the server to the client, e.g. when asynchronous work completes? I currently use polling for other things in the application, i.e. to start an asynchronous job that is handled by the worker process; upon completion, the polling client (browser) sees a completion flag. This works, but I'd like to improve it if possible.
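To make the question concrete, the switch I have in mind would keep the spawn and just change the server config, roughly like this (untested; Rainbows! reuses Unicorn's config format, and :EventMachine is one of the concurrency models the faye-websocket README lists as supported):
# config/rainbows.rb - hypothetical replacement for config/unicorn.rb
worker_processes 1
timeout 180
Rainbows! do
  use :EventMachine        # concurrency model with faye-websocket support
  worker_connections 100
end
@resque_pid = nil
before_fork do |server, worker|
  @resque_pid ||= spawn("bundle exec rake jobs:work")
end
The Procfile entry would then become web: bundle exec rainbows -c config/rainbows.rb.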
I'm open to hearing about your experiences and suggestions. Thanks in advance!

What "workers" really are?

I feel lost.
Nginx has its own "worker" processes,
Unicorn has its own "worker" settings,
and Resque has its own "workers".
Should Unicorn's settings be related to Nginx's or Resque's?
I really searched for a clue but didn't get anywhere.
Are all of these "workers" the same?
If not, can you briefly explain what they are?
Nginx - Nginx is the web server; it accepts incoming requests and proxies them to the Unicorn workers.
Unicorn - Each Unicorn worker is a forked process that loads its own copy of the Rails environment and serves web requests.
Resque - Each Resque worker is a separate process that also loads its own copy of the Rails environment, but it processes background jobs instead.
Unicorn and Resque serve different purposes:
Unicorn serves web requests.
Resque pulls background jobs from Redis and processes them.
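To make the distinction concrete, here is where each kind of "worker" typically gets configured; the file contents are illustrative sketches, not drop-in configs:
# /etc/nginx/nginx.conf - Nginx workers: OS-level processes that accept and serve connections
worker_processes 4;
# config/unicorn.rb - Unicorn workers: forked Rails processes that handle web requests
worker_processes 3
# Procfile - Resque workers: separate Rails processes that pull jobs from Redis
worker: bundle exec rake resque:work QUEUE=*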
