memory leak issue in rails and phusion passenger


The Rails site was running so slowly that I had to reboot the OS, but only an hour after rebooting Ubuntu the system was incredibly slow again, so I checked the Passenger memory statistics:
------ Passenger processes ------
PID VMSize Private Name
---------------------------------
1076 215.8 MB 0.3 MB PassengerWatchdog
1085 2022.3 MB 4.4 MB PassengerHelperAgent
1089 109.4 MB 6.4 MB Passenger spawn server
1093 159.2 MB 0.8 MB PassengerLoggingAgent
1883 398.1 MB 110.1 MB Rack: /home/guarddog/public_html/guarddog.com/current
1906 1174.6 MB 885.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
3648 370.1 MB 131.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
14830 370.6 MB 83.0 MB Rack: /home/guarddog/public_html/guarddog.com/current
15124 401.2 MB 113.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
15239 413.5 MB 127.7 MB Rack: /home/guarddog/public_html/guarddog.com/current
15277 400.5 MB 113.6 MB Rack: /home/guarddog/public_html/guarddog.com/current
15285 357.1 MB 70.1 MB Rack: /home/guarddog/public_html/guarddog.com/current
### Processes: 12
### Total private dirty RSS: 1648.10 MB
It boggles my mind how that one Rack process is using 885.9 MB of private dirty RSS memory one hour after rebooting the OS, when total usage was down to 100 MB. Now, one hour later, the total is at 1648.10 MB. The site is so slow the page won't even load.
I assume it's a memory leak so I added this line of code throughout the application:
puts "Object count: #{ObjectSpace.count_objects}"
But I do not know how to interpret the data it gives me:
Object count: {:TOTAL=>2379635, :FREE=>318247, :T_OBJECT=>35074, :T_CLASS=>6707, :T_MODULE=>1760, :T_FLOAT=>174, :T_STRING=>1777046, :T_REGEXP=>2821, :T_ARRAY=>75160, :T_HASH=>64227, :T_STRUCT=>774, :T_BIGNUM=>7, :T_FILE=>7, :T_DATA=>55075, :T_MATCH=>10, :T_COMPLEX=>1, :T_RATIONAL=>63, :T_NODE=>37652, :T_ICLASS=>4830}
Note that I only ran ObjectSpace.count_objects on my local machine, since my Ubuntu server is so slow it cannot even load the page.
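A single count_objects snapshot is hard to read on its own. In the output above, T_STRING takes 1777046 of the 2379635 slots (roughly three quarters), but one sample cannot tell you whether that number is growing. A minimal sketch, assuming you can take one snapshot before and one after exercising a suspect action (the helper name count_growth is made up for illustration):

```ruby
# Compare two ObjectSpace.count_objects snapshots and list the object
# types whose counts grew the most between them.
# :TOTAL and :FREE describe heap slots rather than object types, so they are skipped.
def count_growth(before, after, top: 5)
  after
    .map { |type, count| [type, count - before.fetch(type, 0)] }
    .reject { |type, delta| type == :TOTAL || type == :FREE || delta <= 0 }
    .sort_by { |_type, delta| -delta }
    .first(top)
end

# Hypothetical usage around a suspect piece of code:
#   before = ObjectSpace.count_objects
#   ... hit the suspect controller action a few times ...
#   GC.start
#   after = ObjectSpace.count_objects
#   count_growth(before, after).each { |type, delta| puts "#{type}: +#{delta}" }
```

If the same type keeps growing across repeated runs even after GC.start, that is a hint about where to look; it is not proof of a leak.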
Here's some other OS statistics:
$ cat /proc/meminfo
MemTotal: 6113156 kB
MemFree: 3027204 kB
Buffers: 103540 kB
Cached: 261544 kB
SwapCached: 0 kB
Active: 2641168 kB
Inactive: 248316 kB
Active(anon): 2524652 kB
Inactive(anon): 328 kB
Active(file): 116516 kB
Inactive(file): 247988 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 6287356 kB
SwapFree: 6287356 kB
Dirty: 36 kB
Writeback: 0 kB
AnonPages: 2524444 kB
Mapped: 30108 kB
Shmem: 568 kB
Slab: 77268 kB
SReclaimable: 48528 kB
SUnreclaim: 28740 kB
KernelStack: 4648 kB
PageTables: 43044 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 9343932 kB
Committed_AS: 5179468 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 293056 kB
VmallocChunk: 34359442172 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 46848 kB
DirectMap2M: 5195776 kB
df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/roadrunner-root 134821120 22277596 105695012 18% /
udev 3047064 4 3047060 1% /dev
tmpfs 1222632 252 1222380 1% /run
none 5120 0 5120 0% /run/lock
none 3056576 88 3056488 1% /run/shm
none 102400 0 102400 0% /run/user
/dev/sda1 233191 29079 191671 14% /boot
On a side note, if I run "kill -9 1906" to kill that one rack process consuming so much memory, would that help?

First of all, hot-fix the immediate production problem by implementing a watchdog (http://dev.mensfeld.pl/2012/08/simple-rubyrails-passenger-memory-consumption-limit-monitoring/), and then hunt for the leak, or bloat (https://blog.engineyard.com/2009/thats-not-a-memory-leak-its-bloat).
The process you showed looks like a regular worker. Try killing it under controlled conditions and see what happens; probably nothing bad.
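As a rough idea of what such a watchdog does (this is my own sketch, not the code from the linked article), you can parse the Rack rows of passenger-memory-stats and pick out the workers whose private memory is over a limit. The helper name and the 500 MB limit are assumptions for illustration:

```ruby
# Return [pid, private_mb] for every Rack worker row whose "Private" column
# exceeds limit_mb. Rows are expected in the format shown in the question:
#   PID  VMSize  Private  Name
#   1906 1174.6 MB 885.9 MB Rack: /path/to/app/current
def bloated_workers(stats_text, limit_mb)
  rack_row = /^\s*(\d+)\s+[\d.]+ MB\s+([\d.]+) MB\s+Rack: /
  stats_text.each_line.map do |line|
    match = rack_row.match(line)
    next unless match
    private_mb = match[2].to_f
    [match[1].to_i, private_mb] if private_mb > limit_mb
  end.compact
end

# Hypothetical usage from cron (the 500 MB limit is an assumption, tune it):
#   bloated_workers(`passenger-memory-stats`, 500).each do |pid, mb|
#     warn "killing Rack worker #{pid} using #{mb} MB"
#     Process.kill("TERM", pid)
#   end
```

Against the listing in the question, a 500 MB limit would select only PID 1906.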
See if you can correlate this with a certain (often long-running) controller action, or with Apache restarts/reloads (I have this problem on a regular basis: 1 in 20 processes goes bonkers after a restart).
Extend the Rails logs so they contain PIDs (https://gist.github.com/krutten/1091611, for example) and write a simple script that dumps memory use into a file every minute or so (make sure you don't fill your disk). This will tell you exactly when a process started bloating, so you can trace in the logs what it was doing before and as this happened.
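The memory-dump script can be very small. A sketch, assuming the workers show up in ps with a command line starting with "Rack:" (as the names in passenger-memory-stats suggest) and that something like cron runs it once a minute; both function names are made up:

```ruby
require "time"

# Turn `ps -eo pid,rss,args` output into one log line per Rack worker:
#   <UTC timestamp> pid=<pid> rss_kb=<rss>
def rack_memory_lines(ps_output, now)
  ps_output.each_line.map do |line|
    pid, rss, command = line.strip.split(/\s+/, 3)
    next unless command&.start_with?("Rack:")
    "#{now.getutc.iso8601} pid=#{pid} rss_kb=#{rss}"
  end.compact
end

# Append one snapshot to the log file at `path`.
# Meant to be called once per run, e.g. from a cron job every minute.
def log_snapshot(path, ps_output: `ps -eo pid,rss,args`, now: Time.now)
  lines = rack_memory_lines(ps_output, now)
  File.open(path, "a") { |file| lines.each { |entry| file.puts(entry) } }
end
```

Because every line carries the PID and a timestamp, you can line it up with PID-tagged Rails log entries. Put the file under logrotate (or similar) so it cannot fill the disk.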

Related

embedded Linux RAM usage and confusion

I am trying to optimize RAM usage on an embedded Linux system.
I am running two builds on the same hardware with pretty much the same software: one is an old version (Yocto morty) and the other is a new version (Yocto dunfell).
free for new version
total used free shared buff/cache available
Mem: 57568 41020 2036 324 14512 12520
Swap: 0 0 0
free for old version
total used free shared buff/cache available
Mem: 57496 34120 2588 284 20788 19516
Swap: 0 0 0
I can see that used memory increased by about 7 MB in the new version.
When I run ps axo pid,rss,cmd | awk '{sum+=$2} END {print sum }'
to sum the RSS values of all processes on my system,
I find that the total for the new version (83652K) is actually lower than for the old version (124240K). Nearly every process has a smaller RSS than in the old version.
This does not make sense to me.
So I used cat /proc/meminfo to get more detail.
I found that the key difference is in Active(anon) and AnonPages, which increased by about 6 MB in the new version.
But I don't know where this 6 MB comes from or how to trace it.
Thanks
*****************new version***********************
MemTotal: 57568 kB
MemFree: 1460 kB
MemAvailable: 14752 kB
Buffers: 132 kB
Cached: 11760 kB
SwapCached: 0 kB
Active: 31456 kB
Inactive: 4044 kB
Active(anon): 23720 kB
Inactive(anon): 212 kB
Active(file): 7736 kB
Inactive(file): 3832 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 12 kB
Writeback: 0 kB
AnonPages: 23628 kB
Mapped: 6712 kB
Shmem: 324 kB
Slab: 10464 kB
SReclaimable: 5300 kB
SUnreclaim: 5164 kB
KernelStack: 1496 kB
PageTables: 1948 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 28784 kB
Committed_AS: 651316 kB
VmallocTotal: 958464 kB
VmallocUsed: 1228 kB
VmallocChunk: 951436 kB
***********old version**************************************************
MemTotal: 57496 kB
MemFree: 1996 kB
MemAvailable: 19172 kB
Buffers: 908 kB
Cached: 12092 kB
SwapCached: 0 kB
Active: 30168 kB
Inactive: 812 kB
Active(anon): 18048 kB
Inactive(anon): 216 kB
Active(file): 12120 kB
Inactive(file): 596 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 52 kB
Writeback: 0 kB
AnonPages: 17984 kB
Mapped: 5984 kB
Shmem: 284 kB
Slab: 14188 kB
SReclaimable: 8036 kB
SUnreclaim: 6152 kB
KernelStack: 1480 kB
PageTables: 2120 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 28748 kB
Committed_AS: 649728 kB
VmallocTotal: 958464 kB
VmallocUsed: 1228 kB
VmallocChunk: 951556 kB
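Note that summing RSS over all processes counts every shared page once per process, so that sum is not directly comparable with system-wide totals such as AnonPages. To compare the two dumps field by field instead of by eye, a small sketch (function names are made up) that parses both and lists the fields that changed, largest change first:

```ruby
# Parse /proc/meminfo text into { "Field" => value }.
# Values are in kB, except the HugePages_* counters, which are plain counts.
def parse_meminfo(text)
  text.scan(/^([\w()]+):\s+(\d+)/).map { |name, value| [name, value.to_i] }.to_h
end

# Fields present in both dumps whose values differ,
# sorted by absolute change (largest first). Positive delta = grew.
def meminfo_diff(old_text, new_text)
  old_kb = parse_meminfo(old_text)
  new_kb = parse_meminfo(new_text)
  (old_kb.keys & new_kb.keys)
    .map { |name| [name, new_kb[name] - old_kb[name]] }
    .reject { |_name, delta| delta.zero? }
    .sort_by { |_name, delta| -delta.abs }
end

# Hypothetical usage with the two dumps saved to files:
#   meminfo_diff(File.read("old.txt"), File.read("new.txt"))
#     .each { |name, delta| puts format("%-16s %+d kB", name, delta) }
```

With the numbers posted above, Active(anon) differs by 23720 - 18048 = 5672 kB and AnonPages by 23628 - 17984 = 5644 kB.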

Used memory in /proc/meminfo does not add up

Looking at the output of cat /proc/meminfo, I have MemTotal = 38 GB and Active + Inactive = 14 GB. Cached is only 3 GB.
What is using most of my RAM on top of those 14 GB of Active and Inactive?
How does it reach the 31 GB of used memory that free shows?
I would expect MemAvailable to be about 38 - 14 - 3 = 21 GB; instead MemAvailable is 3.8 GB.
For an explanation of the output, see "Entry in /proc/meminfo" and https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s2-proc-meminfo
> cat /proc/meminfo
MemTotal: 38109840 kB
MemFree: 955788 kB
MemAvailable: 3872636 kB
Buffers: 112176 kB
Cached: 3477880 kB
SwapCached: 0 kB
Active: 11723512 kB = Active(anon) + Active(file)
Inactive: 2919292 kB = Inactive(anon) + Inactive(file)
Active(anon): 11264332 kB
Inactive(anon): 159048 kB
Active(file): 459180 kB
Inactive(file): 2760244 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 580 kB
Writeback: 0 kB
AnonPages: 11053268 kB
Mapped: 321412 kB
Shmem: 370524 kB
Slab: 758444 kB
SReclaimable: 198992 kB
SUnreclaim: 559452 kB
KernelStack: 16608 kB
PageTables: 32104 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 19054920 kB
Committed_AS: 13568652 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 159728 kB
DirectMap2M: 16617472 kB
DirectMap1G: 24117248 kB
I do have additional info
> free -mh
total used free shared buff/cache available
Mem: 36G 31G 1.7G 337M 3.6G 4.5G
Swap: 0B 0B 0B
> top
top - 16:14:09 up 23:43, 1 user, load average: 14.65, 19.58, 18.87
Tasks: 266 total, 2 running, 263 sleeping, 1 stopped, 0 zombie
%Cpu(s): 36.2 us, 5.1 sy, 0.0 ni, 58.3 id, 0.0 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem : 38109840 total, 964388 free, 33356576 used, 3788876 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 3881104 avail Mem
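The "used" figure can be reproduced from the posted numbers: in the top output, 38109840 total - 964388 free - 3788876 buff/cache = 33356576 used. The sketch below does the same arithmetic from /proc/meminfo (taking buff/cache as Buffers + Cached + SReclaimable) and also sums the categories that meminfo itemizes. Which fields to sum is my own assumption, and the function names are made up:

```ruby
# Parse /proc/meminfo text into { "Field" => value } (kB for the fields used here).
def parse_meminfo(text)
  text.scan(/^([\w()]+):\s+(\d+)/).map { |name, value| [name, value.to_i] }.to_h
end

# used_kb:     MemTotal - MemFree - Buffers - (Cached + SReclaimable)
# listed_kb:   sum of the itemized categories below (Cached already includes Shmem)
# unlisted_kb: non-free memory that none of those categories covers
def meminfo_accounting(info)
  cache = info["Cached"] + info["SReclaimable"]
  used = info["MemTotal"] - info["MemFree"] - info["Buffers"] - cache
  listed = %w[AnonPages Cached Buffers Slab KernelStack PageTables].sum { |key| info[key] }
  {
    used_kb: used,
    listed_kb: listed,
    unlisted_kb: info["MemTotal"] - info["MemFree"] - listed
  }
end
```

For the dump in the question this gives used_kb = 33365004 (close to what top reports; the samples were taken at slightly different times), listed_kb = 15450480 and unlisted_kb = 21703572. In other words, roughly 20 GB is in use but does not appear under any of the summed fields, which is where I would keep digging.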

Gitlab-cl server and memory usage

We are running gitlab-cl 10.0.1, installed from the repository, on CentOS 6.9.
We have a physical server with 65 GB of RAM.
The web interface was performing slowly, so we looked at memory usage and saw that the server is swapping a bit and all the memory is used.
There is no active process using it, and free -m confirms it is cached:
total used free shared buffers cached
Mem: 64412 64179 232 140 1 176
-/+ buffers/cache: 64001 410
Swap: 15999 2679 13320
The strange thing is that all memory is allocated on DirectMap2M
cat /proc/meminfo
MemTotal: 65957916 kB
MemFree: 242364 kB
Buffers: 1132 kB
Cached: 193548 kB
SwapCached: 853032 kB
Active: 6302692 kB
Inactive: 1729836 kB
Active(anon): 6276560 kB
Inactive(anon): 1704824 kB
Active(file): 26132 kB
Inactive(file): 25012 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 16383996 kB
SwapFree: 13580524 kB
Dirty: 1576 kB
Writeback: 0 kB
AnonPages: 7595904 kB
Mapped: 162376 kB
Shmem: 144312 kB
Slab: 57184100 kB
SReclaimable: 35132 kB
SUnreclaim: 57148968 kB
KernelStack: 12912 kB
PageTables: 59144 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 49362952 kB
Committed_AS: 18168608 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 395428 kB
VmallocChunk: 34323721400 kB
HardwareCorrupted: 0 kB
AnonHugePages: 3260416 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 7652 kB
DirectMap2M: 67088384 kB
Do you know why this is happening?
Is that normal with gitlab?
I read about a few commands that remove caches from memory:
# sync; echo 1 > /proc/sys/vm/drop_caches
# sync; echo 2 > /proc/sys/vm/drop_caches
# sync; echo 3 > /proc/sys/vm/drop_caches
Are they safe to run on a production machine running gitlab?
Thanks a lot

I can't say exactly why all the memory is used up, but the fact is that it is. At first I expected the standard https://www.linuxatemyram.com/ situation, but I can see that in your case you are actually using 64001 MB of memory.
The commands listed are pointless: all they do is drop disk blocks that are cached in memory, causing an I/O hit the next time the same block is needed.
To find out what's going on, you need to see which process is using up all the memory. There are several ways to get that info:
ps -e -o pid,vsz,comm= | sort -n -k 2
or, to also get the arguments:
ps -e -o pid,vsz,command= | sort -n -k 2|cut -b1-$COLUMNS
You can also start "top" and press uppercase "M" to sort by memory usage.
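One caveat, going only by the /proc/meminfo dump in the question: SUnreclaim is 57148968 kB out of a MemTotal of 65957916 kB. Slab is kernel memory and does not belong to any process, so ps and top will not show it. A small sketch (function names are made up) that expresses selected meminfo fields as a share of MemTotal:

```ruby
# Parse /proc/meminfo text into { "Field" => value } (kB for the fields used here).
def parse_meminfo(text)
  text.scan(/^([\w()]+):\s+(\d+)/).map { |name, value| [name, value.to_i] }.to_h
end

# Percentage of MemTotal taken by each requested field, rounded to one decimal.
def share_of_total(info, fields)
  total = info.fetch("MemTotal")
  fields.map { |field| [field, (100.0 * info.fetch(field) / total).round(1)] }.to_h
end

# Hypothetical usage on the live system:
#   info = parse_meminfo(File.read("/proc/meminfo"))
#   p share_of_total(info, %w[AnonPages Cached SUnreclaim])
```

With the posted numbers, unreclaimable slab comes out at roughly 87% of RAM and AnonPages at roughly 12%. If that holds on your machine, looking at /proc/slabinfo (or slabtop, if installed) is probably more useful than sorting processes.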

Rails + Passenger + Nginx + Dokku 504 after 1 minute of activity

I have a Rails app currently running and stable on a Heroku server (512 MB RAM).
I took the app as is and put it on Dokku (with Intercity) on an Ubuntu server with 14 GB RAM and 2 CPUs (Azure).
The app spins up and works very fast; everything looks fine.
After 1 minute of inactivity I refresh the browser and get a
504 Gateway Time-out
I tried searching for errors or memory issues, but the only thing that looks wrong is this line:
17/01/18 11:24:18 [error] 61198#61198: *2071 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 79.184.17.155, server: cltvf.site, request: "GET /campaigns/5874e4d14bc3600a4a19566/details HTTP/1.1", upstream: "http://172.11.0.3:5000/campaigns/587f4e4d4bc3600a4a19566/details", host: "cltvf.site", referrer: "http://cltv.site/an/u_request_approve"
which I got from the nginx:error-logs command.
172.11.0.3 is an internal IP, if that helps.
When checking whether there is a memory issue, I saw:
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
ac513d4dd4ea 0.00% 199.8 MiB / 13.69 GiB 1.43% 296.7 kB / 156.5 kB 0 B / 0 B 13
a296ec88b1ef 0.01% 254.2 MiB / 13.69 GiB 1.81% 282.5 kB / 111.4 kB 0 B / 614.4 kB 52
beb69ddc4351 0.13% 254.3 MiB / 13.69 GiB 1.81% 286.9 kB / 112.5 kB 0 B / 614.4 kB 51
43665198a31b 0.00% 231.8 MiB / 13.69 GiB 1.65% 19.33 MB / 21.8 MB 0 B / 0 B 12
7d374f36b240 0.00% 231.6 MiB / 13.69 GiB 1.65% 19.34 MB / 21.81 MB 0 B / 0 B 13
04e98f7914b0 0.01% 343.9 MiB / 13.69 GiB 2.45% 14.37 MB / 9.091 MB 0 B / 614.4 kB 51
1255e7837b19 0.20% 231.5 MiB / 13.69 GiB 1.65% 19.34 MB / 21.78 MB 0 B / 0 B 12
378302bbdb84 0.00% 55.11 MiB / 13.69 GiB 0.39% 64.81 kB / 4.737 kB 0 B / 225.3 kB 40
5b8eb7a5423e 0.01% 52.47 MiB / 13.69 GiB 0.37% 71.75 kB / 8.718 kB 0 B / 225.3 kB 40
As you can see, nothing serious.
The same goes for disk usage:
/dev/sda1 28G 7403M 21G 25.5 [##########............................] /
/dev/sdb1 27G 44M 26G 0.2 [......................................] /mnt
/dev/sda1 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/4631d50385f25bf480fc18f5f2c7d93052b0f2ffecd6d04a14076513344b7338
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/4f8488bdd0a683fda71a6789165d44626215ef4ce00f7d6c70c7ff64d7d89c14
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/553fb1ea82841dd534450e9929513b90d17e4be73e271b861716d8f240ef8d17
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/6909bba1bea70a3781f55bea3d059a014ddae8638021bf4f9a82edffab63cc94
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/7200a36e8f3ca4e9358f83aad1ac5de562068f6458045f291812b8ab9e769abf
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/bd289b0106072a2946e40a60bacb2b1024d1075996aff5bb3388290617ad85b2
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/bd4d4632764af3a8e61b6da8d5f137addc2044615a5a36e72f675a180e6f7c7c
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/e050fcacaeb0d9cb759bc72e768b2ceabd2eb95350f7c9ba6f20933c4696d1ef
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/ffd758a6189aab5eac81950df15779f84f7c93a2a81b1707b082cee2202ece4d
I'm posting this question after hours of googling.
thanks

You could start by checking the application logs: dokku logs <app>
You can try connecting to the container directly: curl http://172.11.0.3:5000/
You can try entering the container that is running the web process: dokku enter <app> web
You can attach gdb or strace to the process and use standard Linux debugging tools.

Most probably the request is taking too long to execute and fails with 504 Gateway Timeout.
This usually happens when your server's response time exceeds 60s (depending on the setting in nginx.conf; in Dokku 60s is the default).
The solutions are:
The easiest: increase proxy_read_timeout in /home/dokku/:your_app_name/nginx.conf and reload the nginx config.
Or find the root cause of why your request takes so long to execute and make it respond faster (for example, enable caching, or move time-expensive tasks into separate worker processes and have the web service just report the status of the job, letting the frontend poll until the job is finished).
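To illustrate the second option, here is a very small sketch of the "start the job, return immediately, let the client poll" pattern. It uses in-process threads as a stand-in for real worker processes; JobStore and its methods are made up for this example, and in a real Rails app you would more likely reach for a proper job queue:

```ruby
require "securerandom"

# Keeps the status of background jobs so a web request can return at once
# and a later request can ask whether the job has finished.
class JobStore
  def initialize
    @jobs = {}
    @lock = Mutex.new
  end

  # Run the block in a background thread.
  # Returns [job_id, thread]; the thread is returned only so callers can join it.
  def submit(&work)
    id = SecureRandom.hex(8)
    set(id, status: :running)
    thread = Thread.new do
      begin
        set(id, status: :done, result: work.call)
      rescue StandardError => e
        set(id, status: :failed, error: e.message)
      end
    end
    [id, thread]
  end

  # Current state of a job, or nil for an unknown id.
  def status(id)
    @lock.synchronize { @jobs[id] }
  end

  private

  def set(id, **fields)
    @lock.synchronize { @jobs[id] = fields }
  end
end
```

The controller action that used to take more than 60s would call submit and respond with the job id; a second, cheap action would return status(id) for the frontend to poll.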

Phusion passenger has crossed maximum instances limit

I'm using the following for my rails app.
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]
Rails 3.0.5
Phusion Passenger version 3.0.5
The app sits on a 4 GB RAM Linux box. I recently upgraded my Rails app from 3.0.1 to 3.0.5 for the critical security fix released last week.
I've been noticing a strange thing. I have the following Passenger settings in my /etc/apache2/apache2.conf:
PassengerMaxPoolSize 10
PassengerMaxInstancesPerApp 5
But there are 18 Rack instances spawned by Passenger. It's just one app on the server and there is nothing else. The app's response times have become slow. I suspect the extra Rack instances (coming out of nowhere) are occupying extra memory.
Here is my free -m output:
total used free shared buffers cached
Mem: 4011 3992 19 0 1 22
-/+ buffers/cache: 3968 43
Swap: 8191 5780 2411
Here is my passenger-status command output and passenger-memory-stats output.
passenger-status:
----------- General information -----------
max = 10
count = 5
active = 1
inactive = 4
Waiting on global queue: 0
----------- Application groups -----------
/home/anand/public_html/railsapp/current:
App root: /home/anand/public_html/railsapp/current
* PID: 6704 Sessions: 0 Processed: 72 Uptime: 9m 58s
* PID: 6696 Sessions: 0 Processed: 99 Uptime: 9m 58s
* PID: 6712 Sessions: 0 Processed: 69 Uptime: 9m 57s
* PID: 6688 Sessions: 0 Processed: 52 Uptime: 9m 58s
* PID: 6677 Sessions: 1 Processed: 83 Uptime: 11m 28s
passenger-memory-stats:
--------- Apache processes ---------
PID PPID VMSize Private Name
------------------------------------
6470 1 95.5 MB 0.3 MB /usr/sbin/apache2 -k start
6471 6470 94.7 MB 0.5 MB /usr/sbin/apache2 -k start
6488 6470 378.4 MB 4.6 MB /usr/sbin/apache2 -k start
6489 6470 378.0 MB 3.8 MB /usr/sbin/apache2 -k start
6774 6470 377.4 MB 3.0 MB /usr/sbin/apache2 -k start
### Processes: 5
### Total private dirty RSS: 12.20 MB
-------- Nginx processes --------
### Processes: 0
### Total private dirty RSS: 0.00 MB
------ Passenger processes ------
PID VMSize Private Name
---------------------------------
6472 87.1 MB 0.2 MB PassengerWatchdog
6475 100.9 MB 3.2 MB PassengerHelperAgent
6477 39.4 MB 4.8 MB Passenger spawn server
6482 70.7 MB 0.6 MB PassengerLoggingAgent
6677 289.1 MB 114.3 MB Rack: /home/anand/public_html/railsapp/current
6684 287.3 MB 17.2 MB Rack: /home/anand/public_html/railsapp/current
6688 295.6 MB 82.4 MB Rack: /home/anand/public_html/railsapp/current
6696 299.2 MB 88.9 MB Rack: /home/anand/public_html/railsapp/current
6704 299.0 MB 87.3 MB Rack: /home/anand/public_html/railsapp/current
6712 312.6 MB 113.3 MB Rack: /home/anand/public_html/railsapp/current
23808 1174.7 MB 190.9 MB Rack: /home/anand/public_html/railsapp/current
26271 1767.0 MB 690.0 MB Rack: /home/anand/public_html/railsapp/current
28888 1584.7 MB 177.8 MB Rack: /home/anand/public_html/railsapp/current
32403 1638.5 MB 230.3 MB Rack: /home/anand/public_html/railsapp/current
32427 1573.6 MB 253.4 MB Rack: /home/anand/public_html/railsapp/current
32443 1576.0 MB 234.7 MB Rack: /home/anand/public_html/railsapp/current
### Processes: 16
### Total private dirty RSS: 2289.34 MB
What is going wrong here? Is Rails 3.0.5 starting up extra Rack apps? Please help.
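To make the mismatch concrete: passenger-status above lists 5 PIDs, while passenger-memory-stats lists 12 Rack processes. A sketch (function names are made up) that extracts the PIDs from both outputs and returns the Rack processes the pool does not report:

```ruby
# PIDs of the "Rack: ..." rows in passenger-memory-stats output.
def rack_pids_from_memory_stats(text)
  text.scan(/^\s*(\d+)\s+[\d.]+ MB\s+[\d.]+ MB\s+Rack: /).flatten.map(&:to_i)
end

# PIDs of the "* PID: ..." rows in passenger-status output.
def pids_from_passenger_status(text)
  text.scan(/PID:\s*(\d+)/).flatten.map(&:to_i)
end

# Rack processes that exist according to memory stats
# but are not part of the pool reported by passenger-status.
def untracked_rack_pids(memory_stats_text, status_text)
  rack_pids_from_memory_stats(memory_stats_text) - pids_from_passenger_status(status_text)
end

# Hypothetical usage:
#   p untracked_rack_pids(`passenger-memory-stats`, `passenger-status`)
```

For the two outputs posted here, that leaves 6684, 23808, 26271, 28888, 32403, 32427 and 32443. Why the pool no longer tracks them cannot be read from this output alone.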
