I have a Rails app currently running and stable on Heroku (512 MB RAM).
I took the app as-is and put it on Dokku (with Intercity) on an Ubuntu server with 14 GB RAM and 2 CPUs (Azure).
The app spins up and works very fast; everything looks fine.
But after 1 minute of inactivity I refresh the browser and get a
504 Gateway Time-out
I searched for errors or any memory issues, but the only thing that looks wrong is this entry,
17/01/18 11:24:18 [error] 61198#61198: *2071 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 79.184.17.155, server: cltvf.site, request: "GET /campaigns/5874e4d14bc3600a4a19566/details HTTP/1.1", upstream: "http://172.11.0.3:5000/campaigns/587f4e4d4bc3600a4a19566/details", host: "cltvf.site", referrer: "http://cltv.site/an/u_request_approve"
which I got from the nginx:error-logs command.
The 172.11.0.3 address is an internal IP, if that helps.
When checking for a memory issue, I saw:
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
ac513d4dd4ea 0.00% 199.8 MiB / 13.69 GiB 1.43% 296.7 kB / 156.5 kB 0 B / 0 B 13
a296ec88b1ef 0.01% 254.2 MiB / 13.69 GiB 1.81% 282.5 kB / 111.4 kB 0 B / 614.4 kB 52
beb69ddc4351 0.13% 254.3 MiB / 13.69 GiB 1.81% 286.9 kB / 112.5 kB 0 B / 614.4 kB 51
43665198a31b 0.00% 231.8 MiB / 13.69 GiB 1.65% 19.33 MB / 21.8 MB 0 B / 0 B 12
7d374f36b240 0.00% 231.6 MiB / 13.69 GiB 1.65% 19.34 MB / 21.81 MB 0 B / 0 B 13
04e98f7914b0 0.01% 343.9 MiB / 13.69 GiB 2.45% 14.37 MB / 9.091 MB 0 B / 614.4 kB 51
1255e7837b19 0.20% 231.5 MiB / 13.69 GiB 1.65% 19.34 MB / 21.78 MB 0 B / 0 B 12
378302bbdb84 0.00% 55.11 MiB / 13.69 GiB 0.39% 64.81 kB / 4.737 kB 0 B / 225.3 kB 40
5b8eb7a5423e 0.01% 52.47 MiB / 13.69 GiB 0.37% 71.75 kB / 8.718 kB 0 B / 225.3 kB 40
As you can see, nothing serious. The same goes for disk usage:
/dev/sda1 28G 7403M 21G 25.5 [##########............................] /
/dev/sdb1 27G 44M 26G 0.2 [......................................] /mnt
/dev/sda1 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/4631d50385f25bf480fc18f5f2c7d93052b0f2ffecd6d04a14076513344b7338
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/4f8488bdd0a683fda71a6789165d44626215ef4ce00f7d6c70c7ff64d7d89c14
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/553fb1ea82841dd534450e9929513b90d17e4be73e271b861716d8f240ef8d17
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/6909bba1bea70a3781f55bea3d059a014ddae8638021bf4f9a82edffab63cc94
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/7200a36e8f3ca4e9358f83aad1ac5de562068f6458045f291812b8ab9e769abf
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/bd289b0106072a2946e40a60bacb2b1024d1075996aff5bb3388290617ad85b2
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/bd4d4632764af3a8e61b6da8d5f137addc2044615a5a36e72f675a180e6f7c7c
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/e050fcacaeb0d9cb759bc72e768b2ceabd2eb95350f7c9ba6f20933c4696d1ef
none 28G 7403M 21G 25.5 [##########............................] /var/lib/docker/aufs/mnt/ffd758a6189aab5eac81950df15779f84f7c93a2a81b1707b082cee2202ece4d
I'm posting this question after hours of googling.
Thanks.
You could start by checking the application logs: dokku logs <app>
You can try connecting directly to the upstream: curl http://172.11.0.3:5000/
You can enter the container that is running the web process: dokku enter <app> web
You can use gdb or strace to attach to the process and apply standard Linux debugging tools.
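Put together, a minimal debugging session might look like this (a sketch; the app name is a placeholder, and the IP and port come from the error log above):
dokku logs myapp                   # application logs; myapp is a placeholder
curl -v http://172.11.0.3:5000/    # hit the upstream directly, bypassing nginx
dokku enter myapp web              # shell into the running web container
strace -p <PID>                    # inside the container, attach to the app process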
Most probably the request is taking too long to execute, and it fails with a 504 Gateway Timeout.
This usually happens when your server's response time exceeds 60 seconds (depending on the proxy_read_timeout setting in nginx.conf; in Dokku, 60s is the default).
The solutions are:
The easiest: increase proxy_read_timeout in /home/dokku/:your_app_name/nginx.conf and reload the nginx config, as sketched below.
Or find the root cause of why your request takes so long and make it respond faster: enable caching, for example, or split time-expensive tasks into separate worker processes and have the web service just report job status, letting the frontend poll until the job finishes.
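For the first option, a minimal sketch (the 300s value is illustrative; pick something longer than your slowest request):
# in /home/dokku/<your_app_name>/nginx.conf, inside the location block, set:
#   proxy_read_timeout 300s;
sudo nginx -t && sudo service nginx reload   # validate the config, then reload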
Related
A Confluent Kafka instance is running via docker-compose on my Debian 9 server. I followed this tutorial to get it up and running. However, Control Center is shutting down periodically.
sudo docker-compose ps gives the following output:
control-center /etc/confluent/docker/run Exit 1
The rest of the Confluent services stay up and running.
When checking docker logs (sudo docker-compose logs) I can see that it is spamming the following error:
control-center | INFO [Consumer clientId=_confluent-controlcenter-5-3-0-1-9de26cca-62ca-42d6-9d46-86731fc8109a-StreamThread-5-restore-consumer, groupId=null] Unsubscribed all topics or patterns and assigned partitions (org.apache.kafka.clients.consumer.KafkaConsumer)
EDIT: discovered some more logs:
control-center | [2019-08-30 23:10:02,304] INFO [Consumer clientId=_confluent-controlcenter-5-3-0-1-39ae65e2-457c-4696-b592-504fe320038e-StreamThread-3-consumer, groupId=_confluent-controlcenter-5-3-0-1] Group coordinator broker:29092 (id: 2147483646 rack: null) is unavailable or invalid, will attempt rediscovery
control-center | [2019-08-30 22:38:39,291] INFO [Consumer clientId=_confluent-controlcenter-5-3-0-1-39ae65e2-457c-4696-b592-504fe320038e-StreamThread-8-consumer, groupId=_confluent-controlcenter-5-3-0-1] Attempt to heartbeat failed since group is rebalancing (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
EDIT 2: memory available to docker containers:
NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
ksql-datagen 0.00% 3.312MiB / 7.792GiB 0.04% 18.2kB / 2.11kB 92.8MB / 65.5kB 1
control-center 0.00% 0B / 0B 0.00% 0B / 0B 0B / 0B 0
ksql-cli 0.00% 484KiB / 7.792GiB 0.01% 19.6kB / 0B 41kB / 0B 1
ksql-server 0.36% 136MiB / 7.792GiB 1.70% 39.8MB / 34.5MB 210MB / 147kB 30
rest-proxy 0.12% 107.2MiB / 7.792GiB 1.34% 22.2kB / 2.41kB 72.6MB / 81.9kB 28
connect 0.60% 1.571GiB / 7.792GiB 20.16% 124MB / 110MB 1.04GB / 81.9kB 36
schema-registry 0.20% 176.8MiB / 7.792GiB 2.22% 40.2MB / 38.4MB 93.7MB / 156kB 32
broker 7.59% 621MiB / 7.792GiB 7.78% 573MB / 791MB 171MB / 335MB 73
zookeeper 0.10% 80.9MiB / 7.792GiB 1.01% 9.56MB / 8.99MB 38.4MB / 410kB 21
System memory (command: free):
total used free shared buff/cache available
Mem: 8366596096 6770286592 160227328 219533312 1436082176 1099268096
Swap: 34356588544 2301014016 32055574528
Any ideas how to fix this?
This error comes up when too little memory is allocated.
If you are using Docker Desktop, increase the memory: go to Docker Desktop -> Settings (Preferences on macOS) -> Resources -> Advanced, raise the memory limit, and you will be all set.
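On a headless server like the one in the question, it is worth first confirming that memory is actually the culprit. A sketch, using the container name from the stats above:
sudo docker-compose logs --tail=100 control-center   # last output before the exit
sudo docker inspect control-center --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
free -h                                              # host-level headroom
# If OOMKilled prints true, give the container (and the JVM inside it) more memory.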
Thanks for taking the time to read this. My problem is the following: my auto-scaling policies are tied to a Docker container's memory, so the container scales when it appears to need more. Inside the container, the processes (top) add up to less memory than docker stats <id> reports for the container. At times the container's RAM appears saturated only because dentries and page cache have not been reclaimed.
docker stats does not show the actual RAM consumption that the container uses:
docker stats bf257938fa2d 66.54MiB
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
bf257938fa2d ce88cfdda8f09bc08101 0.00% 66.54MiB / 512MiB 13.00% 1.44MB / 1.26MB 40.3MB / 0B 0
docker exec -it bf257938fa2d top
top - 23:24:02 up 53 min, 0 users, load average: 0.01, 0.21, 0.21
Tasks: 6 total, 1 running, 5 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.7%us, 0.3%sy, 0.0%ni, 95.6%id, 0.2%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 15660100k total, 1989516k used, 13670584k free, 95920k buffers
Swap: 0k total, 0k used, 0k free, 1167184k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 11604 2468 2284 S 0.0 0.0 0:00.02 bash
6 root 20 0 309m 12m 7036 S 0.0 0.1 0:00.09 php-fpm
7 root 20 0 59292 7100 6052 S 0.0 0.0 0:00.00 nginx
8 nginx 20 0 59728 4824 3352 S 0.0 0.0 0:00.03 nginx
9 nginx 20 0 59728 4800 3328 S 0.0 0.0 0:00.02 nginx
70 root 20 0 15188 2040 1832 R 0.0 0.0 0:00.02 top
How can I make the RAM consumption reported inside the container (top) match what is reported outside the container (docker stats)?
Thank you
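One way to compare like-for-like is to ask the cgroup how much of the container's figure is page cache versus process RSS. A sketch, assuming cgroup v1 and the container ID from above:
grep -E '^(cache|rss) ' /sys/fs/cgroup/memory/docker/bf257938fa2d*/memory.stat
# docker stats' MEM USAGE includes the cache line (dentries, page cache), which
# the kernel can reclaim under pressure; the processes shown by top only account
# for the rss line, hence the mismatch.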
Today I updated to the newest packages for Nginx and Passenger. After the update, my app now has a (forking...) process that wasn't there before and doesn't seem to go away. Yet it is taking up memory, and sudo /usr/sbin/passenger-memory-stats reports the following.
--------- Nginx processes ----------
PID PPID VMSize Private Name
------------------------------------
1338 1 186.0 MB 0.8 MB nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
1345 1338 186.3 MB 1.1 MB nginx: worker process
### Processes: 2
### Total private dirty RSS: 1.91 MB
---- Passenger processes -----
PID VMSize Private Name
------------------------------
1312 378.8 MB 2.1 MB Passenger watchdog
1320 663.8 MB 4.2 MB Passenger core
1768 211.5 MB 29.0 MB Passenger AppPreloader: /home/ubuntu/my-app
1987 344.1 MB 52.2 MB Passenger AppPreloader: /home/ubuntu/my-app (forking...)
2008 344.2 MB 41.1 MB Passenger AppPreloader: /home/ubuntu/my-app (forking...)
### Processes: 5
### Total private dirty RSS: 128.62 MB
I have passenger_max_pool_size set to 2. sudo /usr/sbin/passenger-status reports that two processes are currently open. The server is receiving no hits at the moment besides me using the site.
Version : 5.3.0
Date : 2018-05-14 00:41:05 +0000
Instance: ql2TTnkw (nginx/1.14.0 Phusion_Passenger/5.3.0)
----------- General information -----------
Max pool size : 2
App groups : 1
Processes : 2
Requests in top-level queue : 0
----------- Application groups -----------
/home/ubuntu/my-app (production):
App root: /home/ubuntu/my-app
Requests in queue: 0
* PID: 1987 Sessions: 0 Processed: 1 Uptime: 3m 36s
CPU: 0% Memory : 52M Last used: 3m 36s ago
* PID: 2008 Sessions: 0 Processed: 1 Uptime: 3m 35s
CPU: 0% Memory : 41M Last used: 3m 35s ago
Passenger never did this before the update; now it keeps the (forking...) process around permanently, and it seems to have two apps running when it only needs one. I have searched their documentation and know when Passenger uses forking, when it doesn't, and when it kills an app automatically after a certain amount of time. Did they change something in the newest update that I missed in the docs? The line 2008 344.2 MB 89.4 MB Passenger AppPreloader: /home/ubuntu/my-app (forking...) always shows now, and sometimes there are even two of them, whereas before the update the process always showed without the (forking...).
This is normal for Passenger >= 5.3.
Source: I'm a dev at Phusion who works on Passenger.
The Rails site was running so slowly that I had to reboot the OS, but only an hour after rebooting Ubuntu the system was incredibly slow again, so I checked the Passenger memory statistics:
------ Passenger processes ------
PID VMSize Private Name
---------------------------------
1076 215.8 MB 0.3 MB PassengerWatchdog
1085 2022.3 MB 4.4 MB PassengerHelperAgent
1089 109.4 MB 6.4 MB Passenger spawn server
1093 159.2 MB 0.8 MB PassengerLoggingAgent
1883 398.1 MB 110.1 MB Rack: /home/guarddog/public_html/guarddog.com/current
1906 1174.6 MB 885.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
3648 370.1 MB 131.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
14830 370.6 MB 83.0 MB Rack: /home/guarddog/public_html/guarddog.com/current
15124 401.2 MB 113.9 MB Rack: /home/guarddog/public_html/guarddog.com/current
15239 413.5 MB 127.7 MB Rack: /home/guarddog/public_html/guarddog.com/current
15277 400.5 MB 113.6 MB Rack: /home/guarddog/public_html/guarddog.com/current
15285 357.1 MB 70.1 MB Rack: /home/guarddog/public_html/guarddog.com/current
### Processes: 12
### Total private dirty RSS: 1648.10 MB
It boggles my mind that one Rack process is using 885.9 MB of private dirty RSS an hour after rebooting the OS, when total usage right after the reboot was down around 100 MB. One hour later the total is at 1648.10 MB. The site is so slow the page won't even load.
I assume it's a memory leak, so I added this line of code throughout the application:
puts "Object count: #{ObjectSpace.count_objects}"
But I do not know how to interpret the data it gives me:
Object count: {:TOTAL=>2379635, :FREE=>318247, :T_OBJECT=>35074, :T_CLASS=>6707, :T_MODULE=>1760, :T_FLOAT=>174, :T_STRING=>1777046, :T_REGEXP=>2821, :T_ARRAY=>75160, :T_HASH=>64227, :T_STRUCT=>774, :T_BIGNUM=>7, :T_FILE=>7, :T_DATA=>55075, :T_MATCH=>10, :T_COMPLEX=>1, :T_RATIONAL=>63, :T_NODE=>37652, :T_ICLASS=>4830}
Note that I am only running ObjectSpace.count_objects on my local machine, since my Ubuntu server is so slow it cannot even load the page.
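(For reference, one way to make those counts actionable is to watch whether a given type grows without bound across requests, which suggests a leak, or climbs and then plateaus, which suggests bloat. A sketch, assuming the puts output above lands in the Rails log:
grep -o 'T_STRING=>[0-9]*' log/production.log | cut -d'>' -f2 | tail -20
# a monotonically growing T_STRING (or T_ARRAY, T_HASH) count between
# requests points at objects that are never released)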
Here are some other OS statistics:
$ cat /proc/meminfo
MemTotal: 6113156 kB
MemFree: 3027204 kB
Buffers: 103540 kB
Cached: 261544 kB
SwapCached: 0 kB
Active: 2641168 kB
Inactive: 248316 kB
Active(anon): 2524652 kB
Inactive(anon): 328 kB
Active(file): 116516 kB
Inactive(file): 247988 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 6287356 kB
SwapFree: 6287356 kB
Dirty: 36 kB
Writeback: 0 kB
AnonPages: 2524444 kB
Mapped: 30108 kB
Shmem: 568 kB
Slab: 77268 kB
SReclaimable: 48528 kB
SUnreclaim: 28740 kB
KernelStack: 4648 kB
PageTables: 43044 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 9343932 kB
Committed_AS: 5179468 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 293056 kB
VmallocChunk: 34359442172 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 46848 kB
DirectMap2M: 5195776 kB
df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/roadrunner-root 134821120 22277596 105695012 18% /
udev 3047064 4 3047060 1% /dev
tmpfs 1222632 252 1222380 1% /run
none 5120 0 5120 0% /run/lock
none 3056576 88 3056488 1% /run/shm
none 102400 0 102400 0% /run/user
/dev/sda1 233191 29079 191671 14% /boot
On a side note, if I run kill -9 1906 to kill the one Rack process consuming so much memory, would that help?
First of all, hot-fix the immediate production problem by implementing a watchdog (http://dev.mensfeld.pl/2012/08/simple-rubyrails-passenger-memory-consumption-limit-monitoring/), and then hunt for the leak, or bloat (https://blog.engineyard.com/2009/thats-not-a-memory-leak-its-bloat).
The process you showed looks like a regular worker. Try killing the offending process under controlled conditions and see what happens; probably nothing bad.
See if you can correlate this happening with a certain (often long-running) controller action, or with Apache restarts/reloads (I have this problem on a regular basis; 1 in 20 processes goes bonkers after a restart).
Extend the Rails logs so they contain PIDs (https://gist.github.com/krutten/1091611 for example) and make a simple script that dumps memory use into a file every minute or so (make sure you don't fill your disk), something like the sketch below. This will let you know exactly when a process started bloating and then trace in the logs what it was doing before and as this happened.
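A minimal version of that dumper (the log path and interval are illustrative):
while true; do
  { date; ps -eo pid,rss,etime,args | grep '[R]ack:'; } >> /var/log/rack-memory.log
  sleep 60
done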
I'm using the following for my Rails app:
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]
Rails 3.0.5
Phusion Passenger version 3.0.5
The app sits on a 4 GB RAM Linux box. I recently upgraded my Rails app from 3.0.1 to 3.0.5 for the critical security fix released last week.
I've been noticing a strange thing. I have the following Passenger settings in my /etc/apache2/apache2.conf:
PassengerMaxPoolSize 10
PassengerMaxInstancesPerApp 5
But there are 18 Rack instances spawned by Passenger. It's just one app on the server, and there is nothing else. The app's response times have become slow. I suspect the extra Rack instances (coming out of nowhere) are occupying extra memory.
Here is my free -m output:
total used free shared buffers cached
Mem: 4011 3992 19 0 1 22
-/+ buffers/cache: 3968 43
Swap: 8191 5780 2411
Here is my passenger-status command output and passenger-memory-stats output.
passenger-status:
----------- General information -----------
max = 10
count = 5
active = 1
inactive = 4
Waiting on global queue: 0
----------- Application groups -----------
/home/anand/public_html/railsapp/current:
App root: /home/anand/public_html/railsapp/current
* PID: 6704 Sessions: 0 Processed: 72 Uptime: 9m 58s
* PID: 6696 Sessions: 0 Processed: 99 Uptime: 9m 58s
* PID: 6712 Sessions: 0 Processed: 69 Uptime: 9m 57s
* PID: 6688 Sessions: 0 Processed: 52 Uptime: 9m 58s
* PID: 6677 Sessions: 1 Processed: 83 Uptime: 11m 28s
passenger-memory-stats:
--------- Apache processes ---------
PID PPID VMSize Private Name
------------------------------------
6470 1 95.5 MB 0.3 MB /usr/sbin/apache2 -k start
6471 6470 94.7 MB 0.5 MB /usr/sbin/apache2 -k start
6488 6470 378.4 MB 4.6 MB /usr/sbin/apache2 -k start
6489 6470 378.0 MB 3.8 MB /usr/sbin/apache2 -k start
6774 6470 377.4 MB 3.0 MB /usr/sbin/apache2 -k start
### Processes: 5
### Total private dirty RSS: 12.20 MB
-------- Nginx processes --------
### Processes: 0
### Total private dirty RSS: 0.00 MB
------ Passenger processes ------
PID VMSize Private Name
---------------------------------
6472 87.1 MB 0.2 MB PassengerWatchdog
6475 100.9 MB 3.2 MB PassengerHelperAgent
6477 39.4 MB 4.8 MB Passenger spawn server
6482 70.7 MB 0.6 MB PassengerLoggingAgent
6677 289.1 MB 114.3 MB Rack: /home/anand/public_html/railsapp/current
6684 287.3 MB 17.2 MB Rack: /home/anand/public_html/railsapp/current
6688 295.6 MB 82.4 MB Rack: /home/anand/public_html/railsapp/current
6696 299.2 MB 88.9 MB Rack: /home/anand/public_html/railsapp/current
6704 299.0 MB 87.3 MB Rack: /home/anand/public_html/railsapp/current
6712 312.6 MB 113.3 MB Rack: /home/anand/public_html/railsapp/current
23808 1174.7 MB 190.9 MB Rack: /home/anand/public_html/railsapp/current
26271 1767.0 MB 690.0 MB Rack: /home/anand/public_html/railsapp/current
28888 1584.7 MB 177.8 MB Rack: /home/anand/public_html/railsapp/current
32403 1638.5 MB 230.3 MB Rack: /home/anand/public_html/railsapp/current
32427 1573.6 MB 253.4 MB Rack: /home/anand/public_html/railsapp/current
32443 1576.0 MB 234.7 MB Rack: /home/anand/public_html/railsapp/current
### Processes: 16
### Total private dirty RSS: 2289.34 MB
What is going wrong here? Is Rails 3.0.5 starting up extra Rack apps? Please help.
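One thing worth checking: the six large Rack processes (PIDs 23808 through 32443) do not appear in the passenger-status output above, which suggests they are orphans left over from an earlier Apache restart rather than members of the current pool. A sketch to list such orphans (assumes GNU grep for the -P flag):
sudo passenger-status | grep -oP 'PID: \K[0-9]+' | sort > /tmp/known-pids
ps -eo pid,args | grep '[R]ack:' | awk '{print $1}' | sort > /tmp/running-pids
comm -13 /tmp/known-pids /tmp/running-pids   # running, but unknown to Passenger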