Resque: Slow worker startup and Forking - ruby-on-rails

I'm currently moving my application from a Linode setup to EC2. Redis is installed on a remote instance, with various worker instances interacting with the queue. That's all going fantastically.
My problem is the time it takes for a worker to be 'instantiated', and slow forking. Starting a worker usually takes between 30 seconds and a minute (from god.rb starting the worker rake task to the worker actively starting work on the queue). I could live with that, but I've not experienced such a wait on my current Linode production box, so I believe it's a symptom of a bigger problem. The next issue is that jobs that took a second or less in my previous environment now seem to take about 5 to 10 times longer.
I'm assuming this must be some sort of issue with my Ubuntu install on EC2? One notable difference is that I'm running REE 1.8.7-2010.01 in my new setup, and REE 1.8.6 on the old Linode boxes.
Anyone else experienced these issues?

It turns out I had overestimated the CPU power of an EC2 small instance. Moved my workers to a large instance and all is well.
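
A CPU difference like this can be checked by timing the same small workload on both hosts. Resque forks a child process per job by default, so both fork cost and raw CPU speed matter. The following is a minimal sketch, not part of the original setup; the method names `time_fork` and `time_cpu` and the iteration count are assumptions chosen for illustration.

```ruby
require "benchmark"

# Time how long it takes to fork a child, let it exit, and reap it.
# A slow fork here would slow every forked job on the host.
def time_fork
  Benchmark.realtime do
    pid = fork { exit!(0) } # child exits immediately, skipping at_exit hooks
    Process.wait(pid)
  end
end

# Time a fixed chunk of CPU-bound work as a rough single-core speed check.
# The iteration count is arbitrary (an assumption for this sketch).
def time_cpu(iterations = 200_000)
  Benchmark.realtime do
    sum = 0
    iterations.times { |i| sum += i * i }
  end
end

if __FILE__ == $PROGRAM_NAME
  forks = Array.new(5) { time_fork }
  puts format("fork+wait: avg %.4fs", forks.sum / forks.size)
  puts format("cpu loop:  %.4fs", time_cpu)
end
```

Running the script on the old and new machines and comparing the two numbers gives a quick indication of whether the instance type, rather than the Ruby or Ubuntu install, is the bottleneck.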

Related

Passenger processes stuck maxing CPU after hitting 100%

The Setup:
* Ubuntu 18.04 LTS
* Apache 2.4.29
* Passenger 6.0.16
* Ruby 2.3.8
* Rails 4.2.x
I have both staging and prod servers with the same setup on AWS EC2; they are both running the same kernel/build. I upgraded the Ruby/Rails version of my app from Ruby 2.1.x -> 2.3.8, and Rails 4.0 -> 4.2, first on staging then on production.
On staging, everything was working fine; pages were loading quickly and without issue. On prod, pages would start by loading quickly but pretty soon would degrade. The user CPU would max out at 99%+ eventually causing the app to go down and be unresponsive. The only solution was to restart Apache, roughly every 30min.
After a LOT of digging and testing, top -c showed that Passenger RubyApp processes would hit 100% CPU and then stay "locked" at max CPU, even if no one was using the site. I've tried changing different settings in both Apache and Passenger, but nothing seems to work. Effectively, as soon as a few people hit the site in a particular way, ANY of the spawned Passenger processes that hit 100% stay fairly high and either don't shut off or don't exit, and keep burning CPU as if there were some IO issue.
Right now Passenger and Apache configs are exactly the same on staging/prod and are the defaults.
Screenshots of the example top in prod with a few users using it.
And roughly same amount of people using on staging.
Staging looks far more accurate in terms of a Rails app -- I'd expect to see higher memory use than CPU. AWS Support was also baffled, as prod is on an XL and staging is on a Micro instance, and the AWS kernel versions were the same. Here's AWS monitoring around CPU usage... prod was updated on the 20th, but not a lot of people used it over the weekend, and really became a problem on Monday during working hours.
Any ideas of why this is happening on one server vs the other?? It's no particular request that causes it; it's literally any (or 2-3 requests coming in tandem) that will cause the CPU to spike to 100 and get stuck.
TIA.

Why does the ruby process continue to exist after the foreman finishes?

I'm running two Rails applications (one of which depends on the other, an API) on a local development machine.
The applications are relatively heavy, with webpack and so on, so I ignored the fact that the MacBook Pro runs its fans at a fairly high level.
The problem is that the fans kept running after I exited foreman (which is responsible for starting and running the applications). Recently I noticed that this is due to two ruby processes (one from each application) that do not stop when foreman finishes. They keep running, and monitoring shows that between them they load the processor at 100%.
I'm currently solving the problem like this:
Foreman -> Control + C
spring stop for ruby process
How can I solve this problem without the spring stop workaround?
UPD
With overmind there is no such problem: when overmind exits, the ruby process also exits. But overmind can't run two projects at the same time, so it isn't a solution for me.

Rails 5 app on EC2 keeps shutting down every few days.

We upgraded from a t2.micro to a t2.small EC2 instance when I noticed the Rails app was shutting down and I had to restart Unicorn.
Since EC2 doesn't report memory utilization out of the box, I installed the Perl scripts per the AWS docs, and I see that we hit 87% memory utilization in the last hour, even though we have a tiny amount of traffic.
What are the main issues that could be causing this?
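
For a quick check from Ruby itself, memory utilization can also be read from /proc/meminfo on Linux. This is a minimal sketch, not the AWS monitoring scripts mentioned above; the helper names `parse_meminfo` and `memory_used_percent` are made up for illustration, and it assumes a kernel recent enough to report `MemAvailable`.

```ruby
# Parse the text of /proc/meminfo into a hash such as { "MemTotal" => 2000000 }.
# Values are in kB, as reported by the kernel.
def parse_meminfo(text)
  text.each_line.with_object({}) do |line, info|
    key, value = line.split(":", 2)
    next if value.nil?
    info[key.strip] = value.strip.to_i # to_i ignores the trailing " kB"
  end
end

# Percentage of memory in use, treating MemAvailable as free.
# Assumes both MemTotal and MemAvailable are present.
def memory_used_percent(info)
  total = info.fetch("MemTotal")
  available = info.fetch("MemAvailable")
  ((total - available) * 100.0 / total).round(1)
end

if __FILE__ == $PROGRAM_NAME && File.exist?("/proc/meminfo")
  info = parse_meminfo(File.read("/proc/meminfo"))
  puts "memory used: #{memory_used_percent(info)}%"
end
```

Logging this value periodically alongside the Unicorn worker count can show whether usage climbs steadily (suggesting a leak) or is simply high from the start (suggesting too many workers for the instance size).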

Unicorn workers spawn too many threads (mem leaks)

I have a running RoR site which is served by Unicorn. The Unicorn master process spawns 10 workers and manages them well, but the workers sometimes start spawning threads internally and never kill them. This leads to memory leaks and server failure.
I worked around it with a cron script that restarts Unicorn every 10 minutes, but it's a really bad solution. Any ideas?
ScreenProof:
Unicorn (4.6.1) configuration files: https://gist.github.com/907th/4995323
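
Before restarting blindly, it can help to see which workers are actually accumulating threads. On Linux, /proc/<pid>/status contains a "Threads:" line for each process. The sketch below is an illustration only; the helper names `thread_count` and `workers_over_limit` and the idea of a fixed limit are assumptions, not part of the poster's setup.

```ruby
# Extract the thread count from the text of /proc/<pid>/status (Linux only).
# Returns nil if no "Threads:" line is present.
def thread_count(status_text)
  line = status_text.each_line.find { |l| l.start_with?("Threads:") }
  line && line.split(":", 2).last.strip.to_i
end

# Given a hash of { pid => thread count }, return the pids above `limit`.
def workers_over_limit(counts, limit)
  counts.select { |_pid, n| n > limit }.keys
end
```

A usage example, with hypothetical worker pids: build the hash with `pids.map { |pid| [pid, thread_count(File.read("/proc/#{pid}/status"))] }.to_h`, then pass it to `workers_over_limit` to find the workers worth inspecting or restarting individually.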
Look into using Monit (http://mmonit.com/monit/) to monitor Unicorn and keep it in check. Watch Ryan Bates' wonderful video on the subject:* http://railscasts.com/episodes/375-monit
*requires a subscription but it's well worth the paltry $9 he's asking.

Can you reload a Rails app on Passenger in the same seamless way as you can reload one on Unicorn?

With Unicorn, you can restart and reload a Rails app with kill -USR2 [master process], which doesn't kill the process immediately, but starts a new master process and its worker processes in the background. When the new master is ready, you can shut off the old master with kill -QUIT. This lets you restart your website without any visitors noticing a slowdown in request handling.
But with Passenger, you restart the Rails app with touch tmp/restart.txt, which as far as I can tell, causes the Rails app to become unresponsive for the few seconds it takes to restart the Rails application.
Is there a way to use Passenger, but also have the Rails app restart seamlessly?
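
The Unicorn USR2-then-QUIT sequence described in this question can be sketched in Ruby as follows. This is a minimal illustration under stated assumptions: `seamless_restart` is a made-up helper name, `new_master_ready` is a hypothetical callable you would supply (for example, one that checks for the new master's pid file or probes a health URL), and `signaler` exists only so the logic can be exercised without a real Unicorn process.

```ruby
# Send USR2 to the old master, wait until the new master is ready,
# then send QUIT to the old master for a graceful shutdown.
# If the new master never becomes ready, raise and leave the old one running.
def seamless_restart(old_master_pid, new_master_ready:, signaler: Process,
                     timeout: 60, interval: 1)
  signaler.kill("USR2", old_master_pid) # start a new master + workers
  deadline = Time.now + timeout
  until new_master_ready.call
    raise "new master not ready after #{timeout}s" if Time.now > deadline
    sleep interval
  end
  signaler.kill("QUIT", old_master_pid) # old workers finish current requests
end
```

Keeping the readiness check as a parameter is a deliberate choice: how "ready" is detected depends on the deployment, and the old master is only told to quit once that check passes.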
Rolling restarts are available in Phusion Passenger Enterprise.
This is the "licensed version" klochner talked about, but it wasn't released until August. Phusion Passenger Enterprise fully automates rolling restarts (Unicorn requires some manual scripting to make rolling restarts behave in a good way). It also includes a bunch of other useful features such as deployment error resistance, live IRB console, etc.
No. [now yes - see hongli's response]
You're asking for rolling restarts, where the new server processes are brought up before the old ones are killed. Passenger (the free version) won't drop requests, but they will get queued and delayed whenever you deploy.
Rolling restarts have supposedly already been implemented and are available in the licensed version, but not yet released for the free version. I've been unable to figure out how to get the licensed version.
Follow this google groups thread for more info:
https://groups.google.com/forum/#!msg/phusion-passenger/hNvU-ZE7_WY/gOF9XWmhHy0J
You could try running two standalone passenger processes and manually bring one down while the other stays up, but I don't think that's the answer you were looking for.
