An MVC4 / Mono 3.2.1 application is running on Debian with Nginx and the
mono-fastcgi-4 server. It is started as follows:
MONOSERVER=$(which fastcgi-mono-server45)
WEBAPPS="/:/var/www/html/test/"
${MONOSERVER} /applications=${WEBAPPS} /socket=tcp:127.0.0.1:9000 &
For testing, the browser's F5 key is held down for 30 seconds.
After that there is a long delay while the browser shows the page load icon.
After the delay, the message
504 Gateway Time-out
nginx/0.7.67
appears,
and the top command (output below) shows that the mono fastcgi server uses 200% CPU
(2 cores) for a long time, possibly forever.
The only way to stop this is to kill the mono fastcgi server manually and
restart it manually.
How can I make mono fastcgi return pages immediately and not use so much CPU?
If the same application is hosted with Apache and mod_mono, holding and
releasing the F5 key in the browser returns the
page immediately, and CPU usage drops to 0 immediately after F5 is released.
top - 00:40:38 up 1:43, 3 users, load average: 16.49, 15.92, 15.35
Tasks: 59 total, 1 running, 58 sleeping, 0 stopped, 0 zombie
Cpu(s): 34.5%us, 65.5%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 744828k used, 1352324k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 120120k cached
PID VIRT RES SHR %CPU %MEM TIME+ COMMAND
4366 500m 121m 21m 198 5.9 6:24.45 /opt/mono-3.2/bin/mono /opt/mono-3.2/lib/mono/4.5/fastcgi- ....
Update
The answer in
Bad gateway 502 after small load test on fastcgi-mono-server through nginx and ServiceStack
recommends using the same number of threads in nginx and in the fastcgi server.
I'm using the default nginx and mono fastcgi server configuration, where both probably allow 1024 threads.
Does mono actually allow fewer threads? Maybe this causes the issue, or is the fastcgi mono server very old?
Can adding /multiplex to the fastcgi mono server fix this?
Is it reasonable to decrease the number of threads for the not very powerful VPS server shown above?
Are there mono settings that cause the failure?
Nothing is written to the fastcgi log file. How can this be diagnosed?
Additional information about this is posted in https://stackoverflow.com/questions/20512978/how-to-limit-mono-197-cpu-usage-in-mono-fastcgi-server
This is a hard issue that has been happening in production for a long time, and we have no clue where it's coming from. We can sometimes reproduce it on localhost. Heroku Enterprise support has been clueless about this.
On our production database, we currently have the following setup:
Passenger Standalone, threading disabled, limited to 25 processes MAX. No min setup.
3 web dynos
Running SELECT client_addr, count(*) FROM pg_stat_activity GROUP BY client_addr to count the number of connections per instance shows that more than 1 PSQL connection is opened per Passenger process during our peak days.
Assumptions:
A single address corresponds to a single dyno (confirmed by Heroku staff)
Passenger does not spawn more than 25 processes at a time (confirmed with passenger-status during those peaks)
Here is a screenshot of the output of SELECT * FROM pg_stat_activity;:
In the screenshot, we can see 45 psql connections coming from the same dyno that runs Passenger. By our previous logic, there should be no more than 1 connection per Passenger process, so 25 at most.
The logs don't look unusual; nothing mentions a dyno crash or a process crash.
Here is a screenshot of our passenger-status output for the same dyno (at a different time, just to show that no more than 25 processes are created for one dyno):
And finally, one of the responses we got from Heroku support (amazing support, by the way):
I have also seen previous reports of Passenger utilising more connections than expected, but most were closed due to difficulty reproducing, unfortunately.
The Passenger documentation explains that Passenger handles the ActiveRecord connections itself.
Any leads appreciated. Thanks!
Various information:
Ruby Version: 2.4.x
Rails Version: 5.1.x
Passenger Version: 5.3.x
PG Version: 10.x
ActiveRecord Version: 5.1.x
If you need any more info, just let me know in the comments, I will happily update this post.
One last thing: we use ActionCable. I've read somewhere that Passenger handles socket connections in an unusual way (it opens a somewhat hidden process to keep the connection alive). This is one of our leads, but so far we have had no luck reproducing it on localhost. If anyone can confirm how Passenger handles ActionCable connections, it would be much appreciated.
Update 1 (01/10/2018):
Experimented:
Disabled the New Relic Auto-Explain feature as explained here: https://devcenter.heroku.com/articles/forked-pg-connections#disabling-new-relic-explain
Ran a Passenger server locally with min and max pool size set to 3 (any more overheats my computer), then killed processes with various signals (SIGKILL, SIGTERM) to see whether connections are closed properly. They are.
We finally managed to fix the issue we had on Passenger. We have had this issue for a very long time actually.
The fix
If you use ActionCable, and your default cable route is /cable, then change the Procfile from:
web: bundle exec passenger start -p $PORT --max-pool-size $PASSENGER_MAX_POOL_SIZE
to
web: bundle exec passenger start -p $PORT --max-pool-size $PASSENGER_MAX_POOL_SIZE --unlimited-concurrency-path /cable
Explanation
Before the change, each socket connection (ActionCable) would occupy a whole process in Passenger.
But a socket should not take up a whole process. A single process can handle many open socket connections (more than 10,000 at the same time for some big names). Fortunately, we have far fewer socket connections, but still.
After the change, we basically told Passenger not to use a whole process for each socket connection, but rather to dedicate a whole process to handling all the socket connections.
Documentation
The in-depth documentation on how to do Sockets with Passenger: https://www.phusionpassenger.com/library/config/standalone/tuning_sse_and_websockets/
The flag to pass to Passenger: https://www.phusionpassenger.com/library/config/standalone/reference/#--unlimited-concurrency-path-unlimited_concurrency_paths
Some metrics, after 3 weeks with the fix
Number of forked processes on Passenger dramatically decreased (from 75 processes to ~ 15 processes)
Global memory usage on the web dynos dramatically decreased (related to previous point on forked Passenger processes)
The global number of PSQL connections dramatically decreased and has been steady for two days (even after deployment). (from 150 to ~30 connections)
Number of PSQL connections per dyno dramatically decreased, (from ~50 per dyno to less than 10 per dyno)
The number of Redis connections decreased and has been steady for two days (even after deployment)
Average memory usage on PostgreSQL dramatically decreased and has been steady for two days.
The overall throughput is a bit higher than usual (Throughput is the number of requests handled per minute)
I have an 8-core / 16 GB RAM server.
But when I load test this server, the CPU reaches 100% and it crashes mid-process.
My landing page sends 250+ HTTP requests per user.
The server is configured with nginx.
Please comment with any details you need and I will edit this post.
I am using Nginx with Phusion Passenger with a single-threaded Rails application. Here's the catch: within that application, I am using multi-threaded Sidekiq to perform some background jobs. Typically in my database.yml, I would only need to set the pool value to 1. Here's an example:
default: &default
  adapter: mysql2
  encoding: utf8
  collation: utf8_unicode_ci
  pool: 1
  username: username
  password: password
  host: localhost
The reason is that for each TCP socket connection opened, when an HTTP request comes in through that socket, nginx takes the request and passes it to Passenger. Passenger detects that it's a Rails app and spawns a Rails instance, which renders the response as HTML. The response is sent back to nginx, which passes it back to the client (browser). So with a single-threaded Rails app, I only need one database connection per Passenger instance.
But in my sidekiq.yml, I have set concurrency to 5:
:concurrency: 5
This means for each passenger rack instance, I will have 5 concurrent threads handled by sidekiq plus the one connection for the main app, that is a total of 6 database connections for one passenger instance.
When I look at passenger-status, I notice that max_pool_size is set to 6:
----------- General information -----------
Max pool size : 6
So does that mean passenger will never spawn more than 6 Rails instances concurrently? And if that's the case, does that mean my math is correct: 6 (instances) * 6 (database connections: 5 for sidekiq and 1 for main app) = 36 (total database connections possible for my rails app to handle concurrently).
Right now my mysql database is configured to handle 151 max concurrent connections.
SHOW VARIABLES LIKE "max_connections";
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| max_connections | 151 |
+-----------------+-------+
I just want to make sure my math is correct regarding passenger, rails and sidekiq.
First of all, your Sidekiq processes and your web server (in your case, Passenger) are separate. Passenger's thread pool size has no effect on your Sidekiq concurrency; instead, your Sidekiq configuration specifies a separate concurrency. So, we'll consider the two separately:
Passenger
The ActiveRecord database pool value is the number of database connections that your web process will use, in total across all threads. If your Passenger server is set up in multi-process mode, then your max connections from your web processes is db pool size * passenger pool size. On the other hand, if you set it up in multi-threaded mode (which I'd recommend if possible), your max connections is just db pool size (multiplied by however many processes are running; Puma, for example, runs by default two processes with up to fifteen threads or so, so the max connections in that case would be 30).
So, if you're using multi-threaded mode, a pool size of 1 is absolutely not sufficient -- you'll want at least as big a pool as you expect to have threads. In multi-process mode, 1 might work but I doubt it's really worth straying from the default of 5, until you encounter issues.
Sidekiq
Sidekiq always runs in multi-threaded mode (you can technically run multiple processes as well, but I'll assume you aren't). So, like above, you want your connection pool to be at least as big as the number of threads. This might mean that you technically need two different values for your db pool value depending on whether the Rails env is spinning up for Passenger, or for Sidekiq -- see this issue on the Sidekiq repo or this helpful Heroku guide for more information on how to address that.
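As a rough illustration of the "two different values" point above, the sketch below picks a pool size depending on which kind of process is booting. It is written in Python purely for illustration; the function name, the role labels, and the default numbers are assumptions, not part of Rails, Passenger, or Sidekiq.

```python
def db_pool_size(role, web_threads=1, sidekiq_concurrency=5):
    """Return the connection pool size a process of the given role needs.

    role is a hypothetical label: "web" for a Passenger app process,
    "sidekiq" for a Sidekiq worker process.
    """
    if role == "web":
        # One connection per request-handling thread in the web process.
        return web_threads
    if role == "sidekiq":
        # Sidekiq is multi-threaded: at least one connection per worker thread.
        return sidekiq_concurrency
    raise ValueError(f"unknown role: {role!r}")


print(db_pool_size("web"))      # 1
print(db_pool_size("sidekiq"))  # 5
```

In a real Rails app the same decision would normally be made inside database.yml or an initializer, as the linked issue and guide describe.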
In summary
Don't forget that, aside from all the above, you may easily have multiple servers all running the same Rails app, but only one database with one connection limit. If you're running Passenger in multi-instance mode with a max of 6 processes, set your db pool size to 5, then each web server node will use up to 30 connections. If it's running a Sidekiq server, then add 5 to that. You will probably not need more than one Sidekiq server, so 4 web nodes * 30 connections + one Sidekiq process * 5 connections = 125 maximum connections, well within your MySQL connection limit.
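The arithmetic in this summary can be checked with a small sketch. The function and parameter names are made up for illustration, and the numbers are the ones used in this answer (4 web nodes, 6 Passenger processes, pool size 5, one Sidekiq process with concurrency 5) plus the max_connections value from the question.

```python
def max_db_connections(web_nodes, passenger_processes, db_pool_size,
                       sidekiq_processes, sidekiq_concurrency):
    """Upper bound on simultaneous database connections for the whole deployment."""
    # Multi-process Passenger: every app process owns its own connection pool.
    web = web_nodes * passenger_processes * db_pool_size
    # Each Sidekiq process needs one connection per worker thread.
    sidekiq = sidekiq_processes * sidekiq_concurrency
    return web + sidekiq


total = max_db_connections(web_nodes=4, passenger_processes=6, db_pool_size=5,
                           sidekiq_processes=1, sidekiq_concurrency=5)
mysql_max_connections = 151  # value reported by SHOW VARIABLES in the question

print(total)                           # 125
print(total <= mysql_max_connections)  # True
```

With the question's own settings (one node, 6 processes, pool size 1, no Sidekiq counted) the same function gives 6.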
I reviewed the Passenger documentation again, and while the answer above answers the question, I want to add a little more detail:
HTTP client via TCP sends a request to Nginx
Phusion Passenger loaded into Nginx checks if request should be handled by Passenger. If so, request is sent to Passenger Core.
Passenger core, using load balancing rules, determines which process a request should be forwarded to.
Passenger core also takes care of application spawning: if it determines that having more application processes is necessary or beneficial, then it will make that happen subject to user-configured limits: the core will never spawn more processes than a user-configured maximum.
Passenger core also has monitoring and statistics: passenger-memory-stats and passenger-status
Passenger core restarts an application process if it crashes.
UstRouter sits idle and does not consume resources if you did not configure it to send data to Union Station, a monitoring web service
Watchdog monitors Passenger Core and UstRouter. If either of them crashes, it is restarted by the Watchdog.
passenger-memory-stats lists the three aforementioned processes as well as the spawned rack apps:
------ Passenger processes ------
PID VMSize Private Name
---------------------------------
18355 419.1 MB ? Passenger watchdog
18358 1096.5 MB ? Passenger core
18363 427.2 MB ? Passenger ust-router
18700 818.9 MB 256.2 MB Passenger RubyApp: myapp_rack_rails
24783 686.9 MB 180.2 MB Passenger RubyApp: myapp_rack_rails
passenger-status shows that the max_pool_size is 6. That is, at most there will be 6 rack apps spawned by Passenger Core:
----------- General information -----------
Max pool size : 6
App groups : 2
Processes : 3
As stated in another answer, the ActiveRecord database pool value is the number of database connections that your web process will use, in total across all threads.
But since I am using the free Passenger server, which is set up in multi-process mode, then my max connections from my web processes is db pool size * passenger pool size. So since Passenger pool size is 6, and if my db pool size is 1, that is 6 * 1 = 6. That will be 6 maximum database connections.
Sidekiq always runs in multi-threaded mode.
If someone wants to use sidekiq they must configure the number of threads they want to run on or use the default (25). If they are using a database (likely) then to not hit a connection timeout error they will need to have at least as many connections in their database pool as sidekiq threads. Currently they must configure these two values in two different places, database pool in database.yml for ActiveRecord, and sidekiq connections either via command line or the sidekiq yml file. This is a problem as it is difficult to remember when you are modifying one value that you need to modify both.
We have a popular iPhone app where people duel each other a la Wordfeud. We have almost 1 M registered users today.
During peak hours the app gets really long response times, and there are also quite a lot of time outs. We have tried to find the bottleneck, but have had a hard time doing so.
CPU, memory and I/O are all under 50 % on all servers. The problem ONLY appears during peak hours.
Our setup
1 VPS with nginx (1.1.9) as load balancer
4 front servers with Ruby (1.9.3p194) on Rails (3.2.5) / Unicorn (4.3.1)
1 database server with PostgreSQL 9.1.5
The database logs don't show enough long request times to explain all the timeouts shown in the nginx error log.
We have also tried to build and run the app directly against the front servers (during peak hour when all other users are running against the load balancer). The surprising thing is that the app bypassing the load balancer is quick as a bullet even under peak hours.
NGINX SETTINGS
worker_processes=16
worker_connections=4096
multi_accept=on
LINUX SETTINGS
fs.file-max=13184484
net.ipv4.tcp_rmem="4096 87380 4194304"
net.ipv4.tcp_wmem="4096 16384 4194304"
net.ipv4.ip_local_port_range="32768 61000"
Why is the app bypassing the load balancer so fast?
Can nginx as load balancer be the bottleneck?
Is there any good way to compare timeouts in nginx with timeouts in the unicorns to see where the problem resides?
Depending on your settings nginx might be the bottleneck...
Check/tune the following settings in nginx:
the worker_processes setting (should be equal to the number of cores/cpus)
the worker_connections setting (should be very high if you have lots of connections at peak)
set multi_accept on;
if on linux, make sure nginx is using epoll (the use epoll; directive)
Check/tune the following settings of your OS:
number of allowed open file descriptors (sysctl -w fs.file-max=999999 on linux)
tcp read and write buffers (sysctl -w net.ipv4.tcp_rmem="4096 4096 16777216" and
sysctl -w net.ipv4.tcp_wmem="4096 4096 16777216" on linux)
local port range (sysctl -w net.ipv4.ip_local_port_range="1024 65535" on linux)
Update:
So you have 16 workers and 4096 connections per worker,
which means a maximum of 4096*16=65536 concurrent connections.
You probably have multiple requests per browser (ajax, css, js, the page itself, any images on the page, ...), let's say 4 requests per browser.
That allows for slightly over 16k concurrent users. Is that enough for your peaks?
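The estimate above can be written as a short sketch. The function name and the connections_per_request parameter are made up for illustration. The second call reflects a point from the nginx documentation: worker_connections counts all connections, including those to proxied servers, so on a load balancer each client request may occupy two connections.

```python
def max_concurrent_users(worker_processes, worker_connections,
                         requests_per_browser, connections_per_request=1):
    """Rough ceiling on simultaneous users that nginx's connection limits allow."""
    max_connections = worker_processes * worker_connections
    per_user = requests_per_browser * connections_per_request
    return max_connections // per_user


# Numbers from the question: 16 workers, 4096 connections each, 4 requests per browser.
print(max_concurrent_users(16, 4096, 4))                             # 16384
# Assuming each proxied request also holds one upstream connection.
print(max_concurrent_users(16, 4096, 4, connections_per_request=2))  # 8192
```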
How do you set up your upstream server group and what is the load balancing method you use?
It's hard to imagine that Nginx itself is the bottleneck. Is it possible that some upstream app servers get hit much more than others and start to refuse connections because their backlog is full? See this load balancing issue on Heroku and see if you can get more help there.
As of nginx version 1.2.2, nginx provides the least_conn directive. That might be an easy fix. I haven't tried it myself yet.
Specifies that a group should use a load balancing method where a
request is passed to the server with the least number of active
connections, taking into account weights of servers. If there are
several such servers, they are tried using a weighted round-robin
balancing method.
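To make the quoted rule concrete, here is a minimal sketch of the selection step only. It is not nginx's implementation: the function name and server names are hypothetical, and the weighted round-robin tie-break is left out, so the function returns every server that ties for the lowest connections-to-weight ratio.

```python
from fractions import Fraction


def least_conn_candidates(servers):
    """Return the names of servers with the lowest active/weight ratio.

    servers maps a name to a tuple (active_connections, weight).
    """
    # Fractions avoid floating point surprises when comparing ratios.
    ratios = {name: Fraction(active, weight)
              for name, (active, weight) in servers.items()}
    best = min(ratios.values())
    return sorted(name for name, ratio in ratios.items() if ratio == best)


upstream = {
    "app1": (12, 1),
    "app2": (3, 1),
    "app3": (6, 2),  # twice the weight, so 6 connections count like 3
    "app4": (9, 1),
}
print(least_conn_candidates(upstream))  # ['app2', 'app3']
```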
My RubyOnRails app is set up with the usual pack of mongrels behind Apache configuration. We've noticed that our Mongrel web server memory usage can grow quite large on certain operations and we'd really like to be able to dynamically do a graceful restart of selected Mongrel processes at any time.
However, for reasons I won't go into here it can sometimes be very important that we don't interrupt a Mongrel while it is servicing a request, so I assume a simple process kill isn't the answer.
Ideally, I want to send the Mongrel a signal that says "finish whatever you're doing and then quit before accepting any more connections".
Is there a standard technique or best practice for this?
I've done a little more investigation into the Mongrel source and it turns out that Mongrel installs a signal handler to catch a standard process kill (TERM) and do a graceful shutdown, so I don't need a special procedure after all.
You can see this working from the log output you get when killing a Mongrel while it's processing a request. For example:
** TERM signal received.
Thu Aug 28 00:52:35 +0000 2008: Reaping 2 threads for slow workers because of 'shutdown'
Waiting for 2 requests to finish, could take 60 seconds.Thu Aug 28 00:52:41 +0000 2008: Reaping 2 threads for slow workers because of 'shutdown'
Waiting for 2 requests to finish, could take 60 seconds.Thu Aug 28 00:52:43 +0000 2008 (13051) Rendering layoutfalsecontent_typetext/htmlactionindex within layouts/application
Look at using monit. You can dynamically restart mongrel based on memory or CPU usage. Here's a line from a config file that I wrote for a client of mine.
check process mongrel-8000 with pidfile /var/www/apps/fooapp/current/tmp/pids/mongrel.8000.pid
start program = "/usr/local/bin/mongrel_rails cluster::start --only 8000"
stop program = "/usr/local/bin/mongrel_rails cluster::stop --only 8000"
if totalmem is greater than 150.0 MB for 5 cycles then restart # eating up memory?
if cpu is greater than 50% for 8 cycles then alert # send an email to admin
if cpu is greater than 80% for 5 cycles then restart # hung process?
if loadavg(5min) greater than 10 for 3 cycles then restart # bad, bad, bad
if 3 restarts within 5 cycles then timeout # something is wrong, call the sys-admin
if failed host 192.168.106.53 port 8000 protocol http request /monit_stub
with timeout 10 seconds
then restart
group mongrel
You'd then repeat this configuration for all of your mongrel cluster instances. The monit_stub line is just an empty file that monit tries to download. If it can't, it tries to restart the instance as well.
Note: the resource monitoring seems not to work on OS X with the Darwin kernel.
A better question is how to keep your app from consuming so much memory that you have to restart the mongrels from time to time.
www.modrails.com reduced our memory footprint significantly
Boggy:
If you have one process running, it will gracefully shut down (service all the requests in its queue which should only be 1 if you are using proper load balancing). The problem is you can't start the new server until the old one dies, so your users will queue up in the load balancer. What I've found successful is a 'cascade' or rolling restart of the mongrels. Instead of stopping them all and starting them all (therefore queuing requests until the one mongrel is done, stopped, restarted and accepting connections), you can stop then start each mongrel sequentially, blocking the call to restart the next mongrel until the previous one is back up (use a real HTTP check to a /status controller). As your mongrels roll, only one at a time is down and you are serving across two code bases - if you can't do this you should throw up a maintenance page for a minute. You should be able to automate this with capistrano or whatever your deploy tool is.
So I have 3 tasks:
cap:deploy - which does the traditional restart all at the same time method with a hook that puts up a maintenance page and then takes it down after an HTTP check.
cap:deploy:rolling - which does this cascade across the machine (I pull from iClassify to know how many mongrels are on the given machine) without a maintenance page.
cap deploy:migrations - which does maintenance page + migrations since it's usually a bad idea to run migrations 'live'.
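The cascade restart described above can be sketched roughly as follows. This is not the Capistrano task itself: the stop, start, and is_healthy callables are hypothetical stand-ins for whatever stops a mongrel, starts it, and performs the HTTP check against a /status controller. The demo uses in-memory fakes so that it runs without any real processes.

```python
import time


def rolling_restart(ports, stop, start, is_healthy,
                    timeout=60.0, poll_interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Restart one instance at a time, waiting for each to pass its health check."""
    for port in ports:
        stop(port)   # assumed to return once the instance has shut down gracefully
        start(port)
        deadline = clock() + timeout
        # Block here so that at most one instance is down at any moment.
        while not is_healthy(port):
            if clock() >= deadline:
                raise RuntimeError(f"instance on port {port} did not come back up")
            sleep(poll_interval)


# Demo with in-memory stand-ins instead of real processes.
running = {8000: True, 8001: True, 8002: True}
log = []


def fake_stop(port):
    running[port] = False
    log.append(("stop", port))


def fake_start(port):
    running[port] = True
    log.append(("start", port))


rolling_restart([8000, 8001, 8002], fake_stop, fake_start,
                is_healthy=lambda port: running[port])
print(log)
```

Raising on timeout stops the cascade early, so a broken deploy takes down at most one instance instead of rolling through all of them.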
Try using:
mongrel_cluster_ctl stop
You can also use:
mongrel_cluster_ctl restart
I've got a question:
What happens when /usr/local/bin/mongrel_rails cluster::start --only 8000 is triggered?
Are all of the requests served by this particular process completed, or are they aborted?
I'm curious whether this whole start/restart thing can be done without affecting the end users...