Heroku and slug size bloat - ruby-on-rails

I'm starting to hit a wall with my Heroku app.
I'm well aware of the normal issues with slug size (images, PDFs, and other materials), but my problem likely revolves around other assets brought in by Bower, or possibly by buildpacks.
https://devcenter.heroku.com/articles/slug-compiler
Heroku Slug Size After Multiple Deployments
My compiled Heroku slug looks like this:
$ du -h --max-depth=1
4.0K ./.bower-tmp
30M ./tmp
24K ./features
236K ./config
195M ./public
4.0K ./log
34M ./bin
792K ./db
355M ./vendor
8.0K ./.heroku
22M ./app
64K ./lib
8.0K ./.bundle
136K ./.bower-registry
22M ./.bower-cache
24M ./node_modules
12K ./.profile.d
By far the largest is vendor (355M), followed by public (195M), yet both of those folders are essentially empty locally.
On Heroku, though, they break down like this:
40M vendor/ruby-2.0.0
21M vendor/node
32K vendor/heroku
12K vendor/assets
103M vendor/jvm
192M vendor/bundle
195M public/assets (bower bloat?)
I'm guessing this comes from the several buildpacks I use, for Bower and for PDF generation:
https://github.com/heroku/heroku-buildpack-nodejs
https://github.com/heroku/heroku-buildpack-ruby
https://github.com/razorfly/wkhtmltopdf-buildpack
My app itself looks lean-ish at 22M, but my current Heroku slug is 298.4MB, and the vendor directory alone is more than that according to du. Should I not be using these buildpacks, and instead compile assets on my local machine between builds? I'm not sure what a good deployment strategy (/ slug diet) should look like; any ideas would be greatly appreciated.
UPDATE:
I also tried rebuilding the slug, which I read had worked for others, but to no effect: the slug size after compilation remained the same.
heroku plugins:install https://github.com/heroku/heroku-repo.git
heroku repo:rebuild -a appname
GIST of build: https://gist.github.com/holden/b4721fc798bdaddf52c6
UPDATE 2 (after following the excellent idea presented by drorb)
12K ./.profile.d
21M ./app
4.0K ./log
812K ./db
8.0K ./.heroku
236K ./config
195M ./public
19M ./.bower-cache
60K ./lib
253M ./vendor
4.0K ./.bower-tmp
128K ./.bower-registry
34M ./bin
30M ./tmp
24M ./node_modules
24K ./features
8.0K ./.bundle
Vendor
12K vendor/assets
193M vendor/bundle
21M vendor/node
32K vendor/heroku
40M vendor/ruby-2.0.0
Public/Assets (very long)
https://gist.github.com/holden/ee67918c79dd3d197a6b

The size of vendor/jvm is 103M. Since you are not using JRuby, the only reason I could find for its presence is the yui-compressor gem. Looking at heroku-buildpack-ruby, it seems the JVM is installed in exactly this case:
def post_bundler
  if bundler.has_gem?('yui-compressor') && !ruby_version.jruby?
    install_jvm(true)
    ENV["PATH"] += ":bin"
  end
end
If you can avoid using yui-compressor, you should be able to save 103M on your slug size.
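If the only thing pulling in yui-compressor is asset minification, one possible swap is to let uglifier and Sass handle compression instead. This is a hedged sketch, not something taken from your Gemfile; uglifier needs a JS runtime via ExecJS, which the node your nodejs buildpack vendors should satisfy:
```
# Gemfile -- dropping yui-compressor means heroku-buildpack-ruby skips the JVM install
# gem 'yui-compressor'
gem 'uglifier'     # JS minification through ExecJS
gem 'sass-rails'   # CSS compression via the Sass compiler

# config/environments/production.rb
config.assets.js_compressor  = :uglifier
config.assets.css_compressor = :sass
```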

FWIW, we removed Bower from our app and replaced it with the Rails Assets framework. We came to the conclusion that using Bower in a Rails app was somewhat pointless as Bundler essentially serves the same function.
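For anyone curious what that switch looks like, here is a minimal sketch (the bootstrap package is only an illustration, and older Bundler versions may require listing the source globally instead of in a block):
```
# Gemfile
source 'https://rails-assets.org' do
  gem 'rails-assets-bootstrap'   # any Bower package, prefixed with rails-assets-
end
```
The files are then pulled in through Sprockets as usual, e.g. //= require bootstrap in application.js, and the Bower toolchain itself stays out of the build.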

Part of the problem may be pointing at Git repos in your Gemfile. At one point I needed to point at a Rails commit that hadn't been released and it added > 100 MB to my slug size over pointing at a released version.
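To make the difference concrete, a sketch (the ref is a placeholder, not a real commit):
```
# Gemfile
# A git-sourced gem makes Bundler clone the whole repository into vendor/bundle,
# which for something the size of rails/rails adds a lot of weight to the slug:
gem 'rails', github: 'rails/rails', ref: 'abc1234'   # placeholder ref

# A released version only pulls the packaged gem:
gem 'rails', '4.0.2'
```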

Related

tar -cf not preserving exact modification time

When creating a tar archive with -c, the modification time seems to change: specifically, it cuts off everything after the decimal point, leaving the mtime as just the integer value of what it was.
Notice:
```
[localhost] $ mkdir test
[localhost] $ stat test
File: ‘test’
Size: 4096 Blocks: 8 IO Block: 4096 directory
Modify: 2016-07-18 17:01:33.116807520 -0400 # <------ Notice exact time
[localhost] $ tar -cf test.tar test
[localhost] $ tar -xf test.tar
[localhost] $ stat test
File: ‘test’
Size: 4096 Blocks: 8 IO Block: 4096 directory
Modify: 2016-07-18 17:01:33.000000000 -0400 # <------ Notice how time is rounded
```
(I removed irrelevant parts from the output of stat for readability.)
I've consulted man tar but couldn't find an option that preserves the exact modification time in nanoseconds. Could someone explain why this behavior occurs, or is it expected during tar creation?
Update: So far no luck. I tried playing around with tar options, but most of the options that deal with time relate to a file's access time, not its mtime. The ones that do deal with mtime change it, which isn't what I'm looking for.
Just in case anyone googling the same issue stumbles upon this thread (like I did):
The solution (at least one of them) is to use the -H option to pick the pax (POSIX) archive format, e.g. tar -H posix -cf test.tar test: the default GNU format only stores whole-second mtimes, while the pax format records sub-second timestamps. This is answered here:
https://unix.stackexchange.com/questions/397130/tar-how-to-preserve-timestamps-down-to-more-than-a-second-of-precision/397132#397132
The tar(1) manpage does not spell out the practical implications of the -H argument at all; I think it would be very helpful if a search for a likely keyword ("nanosecond", "second", "resolution", etc.) led to the paragraph on -H.

While creating a project-specific gemset for a new Rails application, I got this error: .ruby-version is not empty, moving aside to preserve

Specifically, this is what I typed into terminal and what came back:
$ mkdir myapp
$ cd myapp
$ rvm use ruby-2.1.0@myapp --ruby-version --create
ruby-2.1.0 - #gemset created /usr/local/rvm/gems/ruby-2.1.0@myapp
ruby-2.1.0 - #generating myapp wrappers.
Using /usr/local/rvm/gems/ruby-2.1.0 with gemset myapp
.ruby-version is not empty, moving aside to preserve.
.ruby-gemset is not empty, moving aside to preserve.
$ ls -la .ruby*
-rw-rw-r-- 1 danisyellis staff 6 Jan 24 14:26 .ruby-gemset
-rw-rw-r-- 1 danisyellis staff 6 Jan 24 14:26 .ruby-gemset.01.24.2014-14:26:06
-rw-rw-r-- 1 danisyellis staff 11 Jan 24 14:26 .ruby-version
-rw-rw-r-- 1 danisyellis staff 11 Jan 24 14:26 .ruby-version.01.24.2014-14:26:06
$ cat .ruby*
myapp
myapp
ruby-2.1.0
ruby-2.1.0
I've searched the internet for that error message and haven't found anything that explains it, so I don't know what it means.
It almost looks like my computer ran the command twice and tried to create a duplicate?
Questions:
Is that what happened or was it something else?
If yes, why did it do that?
What does "moving aside to preserve" mean?
Is there anything I can change/clean up so that everything works properly and cleanly?
Thanks so much for any help you can give! I'm pretty new to all this so answers with a decent amount of detail/hand-holding would be appreciated.
This looks like a bug; please report it here: https://github.com/wayneeseguin/rvm/issues
In the meantime you can safely ignore it, and you can remove the duplicate files quite easily:
rm -f .ruby-*\.*

Development Log file exceeds GitHub's file size limit, even after deleting file

I tried to commit some changes in my app, and received an error that the development log was too big, at 512MB. I deleted the development log file and tried again, and the same error showed up with a log size of 103.2MB. I also tried rake log:clear, with the same error.
Apparently the development log file is getting rewritten. I have never used the logs and would probably not miss them. Is there a way to commit to git without rewriting the development log?
2 files changed, 0 insertions(+), 1096498 deletions(-)
rewrite log/development.log (100%)
rewrite log/production.log (100%)
[master]~/Projects/schoolsapi: git push origin master
Username:
Password:
Counting objects: 26, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (17/17), 6.90 MiB | 322 KiB/s, done.
Total 17 (delta 7), reused 0 (delta 0)
remote: Error code: 026c4e06d174bf5b0e51c754dc9459b0
remote: warning: Error GH413: Large files detected.
remote: warning: See http://git.io/iEPt8g for more information.
remote: error: File log/development.log is 103.32 MB; this exceeds GitHub's file size limit of 100 MB
Update after trying the suggestions from answers 1 and 2 below:
The problem still exists. I've removed the log file from the git repo and from my local machine, added the .gitignore file, and updated development.rb with the config.logger bit. The last two lines below show that development.log no longer exists in git or on my local machine.
[master]~/Projects/schoolsapi: git add .
[master]~/Projects/schoolsapi: git commit -m"Tried config logger per apnea diving"
[master b83b259] Tried config logger per apnea diving
2 files changed, 4 insertions(+), 0 deletions(-)
create mode 100644 .gitignore
[master]~/Projects/schoolsapi: git push origin master
Username:
Password:
Counting objects: 38, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (23/23), done.
Writing objects: 100% (26/26), 6.90 MiB | 525 KiB/s, done.
Total 26 (delta 12), reused 0 (delta 0)
remote: Error code: e69d138ee720f7bcb8112e0e7ec03470
remote: warning: Error GH413: Large files detected.
remote: warning: See http://git.io/iEPt8g for more information.
remote: error: File log/development.log is 103.32 MB; this exceeds GitHub's file size limit of 100 MB
[master]~/Projects/schoolsapi: rm log/development.log
rm: log/development.log: No such file or directory
[master]~/Projects/schoolsapi: git rm log/development.log
fatal: pathspec 'log/development.log' did not match any files
[master]~/Projects/schoolsapi:
UPDATE
I had earlier commits which still had the log/development.log file. Using this code provided by the selected answer below (huge thanks to this person), the problem was fixed with one small caveat:
git filter-branch --index-filter 'git rm --cached --ignore-unmatch log/development.log' --tag-name-filter cat -- --all
The caveat is that I had to use git push origin +master to override git's automatic rejection of non-fast-forward updates. I was comfortable doing this because I am the only person working on this app. See this question:
Git non-fast-forward rejected
It seems you earlier added/checked in your development.log file to the git repo.
You need to remove it and make a commit.
git rm log/development.log
git commit -m "removed log file"
In general, you should put your log directory into your .gitignore file
echo log >> .gitignore
And to completely remove all the log files (in case others were added)
git rm -r --cached log
git commit -m "removed log file"
GitHub has recently started enforcing a 100 MB maximum file size limit: https://help.github.com/articles/working-with-large-files
Edit:
It seems you have earlier local commits, not yet pushed to GitHub, which still contain the log file.
Try running
git filter-branch --index-filter 'git rm --cached --ignore-unmatch log/development.log' --tag-name-filter cat -- --all
Side answer: to prevent the file from growing on your own disk, simply log to STDOUT in development.rb:
config.logger = Logger.new(STDOUT)
You'll still see the logs in your server console, but the file won't be populated anymore.
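In context that is just the following, a minimal sketch (the application class name is a placeholder for your own):
```
# config/environments/development.rb
YourApp::Application.configure do
  # Log to STDOUT so log/development.log stops growing
  config.logger = Logger.new(STDOUT)
end
```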

Setting up thinking sphinx after server reboot (Rails project)

Problem:
I am trying to get Sphinx running again after a server reboot. There seems to be no sphinx.conf file when I try to start it:
>searchd
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
FATAL: no readable config file (looked in /etc/sphinxsearch/sphinx.conf, ./sphinx.conf).
I have run:
rake thinking_sphinx:configure
rake thinking_sphinx:index
rake thinking_sphinx:start
The problem is that for some reason no /etc/sphinxsearch/sphinx.conf file is being created. I am new to thinking_sphinx, and this might not be the only problem with the site, but it doesn't seem to be fully set up. For output and more information, read below.
Background info:
I am working on a project I didn't set up initially. We rebooted the server to see some of the changes we made in a constants file, but after the reboot the project no longer displays when you navigate to the site. When you put in the straight IP address it just says "Welcome to nginx".
The port is open and working through our hosting server, so I was told I have to restart some services. One of the issues I came upon was with thinking_sphinx. The rake tasks for Sphinx page is what I referenced, as well as common configuration issues for Sphinx.
I set up the sphinx.yml development paths (we aren't using production). Then I ran
>rake thinking_sphinx:index
which seems to have worked even though it output some warnings:
Generating Configuration to /home/potato/streetpotato/config/development.sphinx.conf
(0.2ms) SELECT @@global.sql_mode, @@session.sql_mode;
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/home/potato/streetpotato/config/development.sphinx.conf'...
indexing index 'bar_core'...
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 14080 kb
collected 249 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 249 docs, 32394 bytes
total 0.254 sec, 127298 bytes/sec, 978.49 docs/sec
indexing index 'bar_delta'...
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 14080 kb
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.003 sec, 0 bytes/sec, 0.00 docs/sec
skipping non-plain index 'bar'...
indexing index 'synonym_core'...
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 13568 kb
collected 3 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 3 docs, 103 bytes
total 0.003 sec, 30356 bytes/sec, 884.17 docs/sec
indexing index 'synonym_delta'...
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 13568 kb
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.002 sec, 0 bytes/sec, 0.00 docs/sec
skipping non-plain index 'synonym'...
indexing index 'user_core'...
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 13568 kb
collected 100 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 100 docs, 3146 bytes
total 0.013 sec, 239348 bytes/sec, 7608.03 docs/sec
skipping non-plain index 'user'...
total 11 reads, 0.000 sec, 3.8 kb/call avg, 0.0 msec/call avg
total 37 writes, 0.000 sec, 2.5 kb/call avg, 0.0 msec/call avg
Then I ran
>rake thinking_sphinx:configure
Generating Configuration to /home/potato/streetpotato/config/development.sphinx.conf
(0.2ms) SELECT @@global.sql_mode, @@session.sql_mode;
Lastly running:
>rake thinking_sphinx:start
Started successfully (pid 29623).
Now even though my log says:
[Fri Nov 16 19:34:29.820 2012] [29623] accepting connections
There is still no sphinx.conf file being generated, and when I try to use the searchd command it still gives me the error:
>searchd --stop
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
FATAL: no readable config file (looked in /etc/sphinxsearch/sphinx.conf, ./sphinx.conf).
I am at a loss. I know this is super long, but only because I am so lost and trying to give as much information as possible. I got further than I did yesterday, but it still doesn't seem to be fully working; I might have to do more setup with unicorn or thin as well. I'm just trying to figure out how to get the site back up and running again. If anyone has run into similar issues with their site going down after a reboot and got it back up (specifically a Rails project on nginx and unicorn or thin, using Sphinx), any insight would be appreciated.
Thanks,
Alan
Calm down!! :-)
Firstly, you don't need a /etc/sphinxsearch/sphinx.conf file; that is just the default file that searchd tries to use when you don't specify any configuration file.
As your log output shows, your Rails application is using the /home/potato/streetpotato/config/development.sphinx.conf file when it starts the searchd process.
Run ps -fe | grep searchd on your dev machine; you should see something like this as the output:
501 14128 1 0 0:00.00 ttys004 0:00.00 searchd --pidfile --config /home/potato/streetpotato/config/development.sphinx.conf
501 14130 13546 0 0:00.00 ttys004 0:00.01 grep searchd
So the Rails app calls searchd with the --config /home/potato/streetpotato/config/development.sphinx.conf argument to specify a different conf file.
From your logs, it is clear that thinking_sphinx is running fine. You can confirm it further by logging into the rails console and running a search method on one of the models that have thinking_sphinx indexes defined on them.
E.g., if your app has an Article model as shown in the link above, the following will return all articles containing "National Parks":
$ rails console
> Article.search( "National Parks" )
=> [#<Article id: 15,... >, #<Article id: 22,...>,...]
The real problem is the application not showing after restarting the server. That has nothing to do with thinking sphinx which is running fine.
Try rolling back all the changes made in the constants file that you mention above, and make sure the application is working fine. Then start making the changes one by one and isolate the one change that breaks your application.
So yeah, this is a hole in ThinkingSphinx (IMHO) -- you can start the searchd server using the various rake tasks (which generate the config as needed) ... but this doesn't work in production.
On a project I worked on last year (running on a Linux server) we created an /etc/init.d script to start searchd -- it takes options, including a path to the configuration file. We did our deploys with Capistrano and put generated code in app/shared -- a directory outside of the source tree. I believe there are some predefined Capistrano tasks that will rebuild the Rails-specific config files when models change or otherwise affect what Sphinx does (same as the rake tasks you mention).
This was one of those cases for us where we had been putting off site search for a long time, and one of our developers got it "all set up" in an afternoon. Getting it deployed took a lot more work.
(Just saw the answer from @prakash-murthy -- he provides some details on how to specify the config path when you initialize searchd. But the trick is to have it start when the system starts, pointing to the config that ThinkingSphinx generates.)
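As a concrete illustration of the Capistrano side, here is a hedged Capistrano 2-era sketch; the task name and hook are my own, and it simply shells out to the same rake tasks used elsewhere in this thread (a real deploy would likely want a stop/restart rather than a bare start):
```
# config/deploy.rb
namespace :sphinx do
  desc 'Regenerate the Sphinx config, rebuild the index and start searchd'
  task :rebuild, :roles => :app do
    run "cd #{release_path} && bundle exec rake thinking_sphinx:configure thinking_sphinx:index thinking_sphinx:start RAILS_ENV=production"
  end
end

after 'deploy:update_code', 'sphinx:rebuild'
```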
OK, so after a day and a half I finally set it all up and got it running (it was more than just Sphinx). I also had to get nginx and unicorn up and running in the background, since we didn't have scripts set up to restart them when the server was rebooted.
When rebooting the server, you have to restart some services before the app will be accessible:
1) thinking_sphinx
reference sites
http://pat.github.com/ts/en/rake_tasks.html
http://www.claytonlz.com/2010/09/thinkingsphinx-conf-problems/
a) create/modify app/config/sphinx.yml
development:
  morphology: stem_en
  port: 9312
  bin_path: "/usr/bin" # set up the path to binary for searchd
  searchd_binary_name: searchd
  indexer_binary_name: indexer
  #mem_limit: 128M
test:
  morphology: stem_en
  port: 9312
  mem_limit: 128M
production:
  morphology: stem_en
  port: 9312
  mem_limit: 512M
  # the searchd ip, in case it's not on localhost
  # address: 10.10.0.0
  # this is by default included in db/sphinx
  # searchd_file_path: "/path/to/shared/folder/sphinx"
b) rake thinking_sphinx:index
c) rake thinking_sphinx:configure # creates config/development.sphinx.conf which helps define sphinx's indexing
d) # then you have to start sphinx; there are two ways to do this:
rake thinking_sphinx:start
rake thinking_sphinx:stop
OR
searchd
searchd --stop
# only the rake commands worked for me, when I tried to run searchd
# I got an error FATAL: no readable config file (looked in /etc/sphinxsearch/sphinx.conf, ./sphinx.conf).
# for some reason we dont have a sphinx.conf file, but the rake commands work without it
e) # once you start thinking_sphinx, check the log/searchd.log file for the line
[Fri Nov 16 19:34:29.820 2012] [29623] accepting connections
2) nginx
reference site:
http://wiki.nginx.org/CommandLine
a) check that nginx is up and running
i) start server
# to check where nginx resides, type this into the server console
which nginx
# whatever path it gives you is how you start the server; this is my path
/usr/sbin/nginx
ii) stop server
/usr/sbin/nginx -s stop # use the path given by which command
3) unicorn (starting app server)
reference site:
http://codelevy.com/2010/02/09/getting-started-with-unicorn.html
a) test if unicorn will run after previous changes
unicorn_rails -p 3000
# the site should now be up and running, check that it is
# console should now log the different actions you do on the site
b) create unicorn.rb in config folder (if none is there)
# only start this step if the step above got the site running
# close the console or exit the process you started above
# contents of unicorn.rb
worker_processes 2 # (starts 2 child processes, not completely necessary)
preload_app true
timeout 30
listen 3000

after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
end
c) run unicorn in the background
# make sure you exited the process above before running this
unicorn_rails -c config/unicorn.rb -D
# this was giving me an error that it said was logged to stderr
# I got the command to run by adding a redirect to the front of the command (see below)
http://stackoverflow.com/questions/2325152/check-for-stdout-or-stderr
exec 2> /dev/null unicorn_rails -c config/unicorn.rb -D
d) (optional) check stats from starting unicorn
i) pgrep -lf unicorn_rails
#sample output
5374 unicorn_rails master -c config/unicorn.rb -D
5388 unicorn_rails worker[0] -c config/unicorn.rb -D # not needed currently
5391 unicorn_rails worker[1] -c config/unicorn.rb -D # not needed currently
ii) cat tmp/pids/unicorn.pid # from inside the streetpotato folder
#sample output
5374

Unicorn Eating Memory

I have an m1.small instance on Amazon with 8GB of hard disk space, on which my Rails application runs. It runs smoothly for 2 weeks, and after that it crashes saying the memory is full.
App is running on rails 3.1.1, unicorn and nginx
I simply don't understand what is taking 13G.
I killed unicorn; the free command now shows some free memory, while df still says 100%.
I rebooted the instance and everything started working fine.
free (before killing unicorn)
total used free shared buffers cached
Mem: 1705192 1671580 33612 0 321816 405288
-/+ buffers/cache: 944476 760716
Swap: 917500 50812 866688
df -l (before killing unicorn)
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8256952 7837520 4 100% /
none 847464 120 847344 1% /dev
none 852596 0 852596 0% /dev/shm
none 852596 56 852540 1% /var/run
none 852596 0 852596 0% /var/lock
/dev/xvda2 153899044 192068 145889352 1% /mnt
/dev/xvdf 51606140 10276704 38707996 21% /data
sudo du -hc --max-depth=1 (before killing unicorn)
28K ./root
6.6M ./etc
4.0K ./opt
9.7G ./data
1.7G ./usr
4.0K ./media
du: cannot access `./proc/27220/task/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/task/27220/fdinfo/4': No such file or directory
du: cannot access `./proc/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/fdinfo/4': No such file or directory
0 ./proc
14M ./boot
120K ./dev
1.1G ./home
66M ./lib
4.0K ./selinux
6.5M ./sbin
6.5M ./bin
4.0K ./srv
148K ./tmp
16K ./lost+found
20K ./mnt
0 ./sys
253M ./var
13G .
13G total
free (after killing unicorn)
total used free shared buffers cached
Mem: 1705192 985876 **719316** 0 365536 228576
-/+ buffers/cache: 391764 1313428
Swap: 917500 46176 871324
df -l (after killing unicorn)
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8256952 7837516 8 100% /
none 847464 120 847344 1% /dev
none 852596 0 852596 0% /dev/shm
none 852596 56 852540 1% /var/run
none 852596 0 852596 0% /var/lock
/dev/xvda2 153899044 192068 145889352 1% /mnt
/dev/xvdf 51606140 10276704 38707996 21% /data
unicorn.rb
rails_env = 'production'
working_directory "/home/user/app_name"
worker_processes 5
preload_app true
timeout 60
rails_root = "/home/user/app_name"
listen "#{rails_root}/tmp/sockets/unicorn.sock", :backlog => 2048
# listen 3000, :tcp_nopush => false
pid "#{rails_root}/tmp/pids/unicorn.pid"
stderr_path "#{rails_root}/log/unicorn/unicorn.err.log"
stdout_path "#{rails_root}/log/unicorn/unicorn.out.log"
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)

before_fork do |server, worker|
  ActiveRecord::Base.connection.disconnect!

  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  old_pid = "#{rails_root}/tmp/pids/unicorn.pid.oldbin"
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end

after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
  worker.user('rails', 'rails') if Process.euid == 0 && rails_env == 'production'
end
I've just released the unicorn-worker-killer gem. It enables you to kill a Unicorn worker based on 1) the maximum number of requests served and 2) the process memory size (RSS), without affecting in-flight requests.
It's really easy to use, and no external tool is required. First, add this line to your Gemfile:
gem 'unicorn-worker-killer'
Then add the following lines to your config.ru:
# Unicorn self-process killer
require 'unicorn/worker_killer'
# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, 10240 + Random.rand(10240)
# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, (96 + Random.rand(32)) * 1024**2
It's highly recommended to randomize the threshold to avoid killing all workers at once.
I think you are conflating memory usage and disk space usage. It looks like Unicorn and its children were using around 500 MB of memory; look at the second "-/+ buffers/cache:" number to see the real free memory.
As far as the disk space goes, my bet goes on some sort of log file or something like that going nuts. You should do a du -h in the data directory to find out what exactly is using so much storage.
As a final suggestion, it's a little-known fact that Ruby never returns memory back to the OS once it allocates it. It DOES still use it internally, but once Ruby grabs some memory, the only way to get the unused memory back to the OS is to quit the process. For example, if you happen to have a process that spikes your memory usage to 500 MB, you won't be able to use that 500 MB again, even after the request has completed and the GC cycle has run. However, Ruby will reuse that allocated memory for future requests, so it is unlikely to grow further.
Finally, Sergei mentions God to monitor the process memory. If you are interested in using this, there is already a good config file here. Be sure to read the associated article as there are key things in the unicorn config file that this god config assumes you have.
As Preston mentioned, you don't have a memory problem (over 40% free); you have a disk-full problem. du reports that most of the storage is consumed under ./data (9.7G).
You could use find to identify very large files; e.g., the following will show all files under that directory greater than 100MB in size:
sudo find /data -size +100M
If unicorn is still running, lsof (LiSt Open Files) can show what files are in use by your running programs or by a specific set of processes (-p PID), eg:
sudo lsof | awk '$5 ~/REG/ && $7 > 100000000 { print }'
will show you open files greater than 100MB in size
You can set up god to watch your unicorn workers and kill them if they eat too much memory. Unicorn master process will then fork another worker to replace this one. Problem worked around. :-)
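For reference, the general shape of such a god watch is sketched below. This is a minimal sketch with placeholder paths and thresholds, watching the unicorn master; per-worker monitoring (which this answer describes) needs a more involved config like the one linked in the previous answer:
```
# unicorn.god -- restart unicorn when its memory use stays above a threshold
God.watch do |w|
  w.name     = 'unicorn'
  w.start    = 'unicorn_rails -c /home/user/app_name/config/unicorn.rb -E production -D'
  w.stop     = 'kill -QUIT `cat /home/user/app_name/tmp/pids/unicorn.pid`'
  w.pid_file = '/home/user/app_name/tmp/pids/unicorn.pid'
  w.interval = 30.seconds

  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 300.megabytes   # placeholder threshold
      c.times = [3, 5]          # 3 of the last 5 checks
    end
  end
end
```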
Try removing New Relic from your app if you are using it. The newrelic_rpm gem itself was leaking memory. I had the same issue and scratched my head for almost 10 days to figure it out.
Hope that helps you.
I contacted the New Relic support team, and below is their reply:
Thanks for contacting support. I am deeply sorry for the frustrating
experience you have had. As a performance monitoring tool, our
intention is "first do no harm", and we take these kind of issues very
seriously.
We recently identified the cause of this issue and have released a
patch to resolve it. (see https://newrelic.com/docs/releases/ruby). We
hope you'll consider resuming monitoring with New Relic with this fix.
If you are interested in doing so, make sure you are using at least
v3.6.8.168 from now on.
Please let us know if you have any addition questions or concerns.
We're eager to address them.
Even after I tried updating the newrelic gem, it was still leaking memory. In the end I had to remove New Relic; although it is a great tool, we cannot use it at such a cost (a memory leak). Hope that helps you.
