Workling processes multiplying uncontrollably - ruby-on-rails

We have a Rails app running on Passenger, and we process some tasks in the background using a combination of RabbitMQ and Workling. The Workling worker process is started with the script/workling_client command. Only one worker process is ever started, and script/workling_client has the :multiple => false option, which should allow only one instance. But sometimes, under mysterious circumstances which I haven't been able to track down, more worklings spawn. If I let the system run for some time, more and more worklings appear. I'm not sure whether these rogue worklings cause any problems, but it is still unsettling not to know why it is happening. We are using Monit to monitor the workling process, so if it dies, Monit will start it again. But this still does not explain why there is suddenly more than one of them.
So my question is: does anyone know what could be causing this and how to make it stop? Is it possible that workling sometimes dies by itself without deleting its pid file? Could there be something wrong with the Daemons gem that workling_client is built upon?

Not an answer - I have the same problems running RabbitMQ + Workling.
I'm using God to monitor the single workling process as well (:multiple => false)...
I found that the multiple worklings were eating up huge amounts of memory & causing serious resource usage, so it's important that I find a solution for this.
You might find this message thread helpful: http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/ed8edd0368066292/5b17d91cc85c3ada?show_docid=5b17d91cc85c3ada&pli=1
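One way to test the stale pid file theory from the question is to check whether the pid recorded in the file still belongs to a live process. The following is only a minimal sketch in plain Ruby: the helper name pid_file_status and any path passed to it are assumptions for illustration, not part of Workling or the Daemons gem.

```ruby
# Hypothetical helper: reports whether the pid recorded in a pid file
# still belongs to a live process.
def pid_file_status(path)
  return :missing unless File.exist?(path)

  pid = File.read(path).to_i
  return :invalid if pid <= 0

  Process.kill(0, pid) # signal 0 sends nothing, it only checks the pid
  :running
rescue Errno::ESRCH
  :stale   # no such process: the pid file was left behind
rescue Errno::EPERM
  :running # the process exists but belongs to another user
end
```

Running this from cron or a console against the worker's pid file whenever an extra workling shows up would show whether the file was pointing at a dead process at that moment.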

Related

running fork in delayed job

We use Delayed Job in our web application and we need multiple Delayed Job workers running in parallel, but we don't know how many will be needed.
The solution I'm currently trying is to run one worker and call fork/Process.detach inside the task that needs it.
Previously I tried running fork directly in the Rails application, but that didn't work well with Passenger.
This solution seems to work well. Could there be any caveats in production?
One issue that happened to me today, and that anyone trying this should watch out for, was the following:
I noticed that the worker was down, so I started it. What I didn't think about was that there were 70 jobs waiting in the queue. Since the processes are forked, they pretty much killed our server for around half an hour by all starting almost immediately and eating all the memory in the process. :]
So making sure that god is watching over the worker is important.
The worker also seems to die often, but I'm not sure yet whether that is connected with the forking.
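The 70-jobs-at-once incident suggests putting a cap on how many forked children may run at the same time. Below is a rough sketch of that idea in plain Ruby, not the setup described above: it waits on children with Process.wait2 instead of detaching them, and the helper name run_in_forks and the max_children parameter are made up for illustration.

```ruby
# Hypothetical helper: runs each job in a forked child, but never more
# than max_children at once. Returns the exit statuses of all children.
def run_in_forks(jobs, max_children: 4)
  running = 0
  statuses = []

  jobs.each do |job|
    if running >= max_children
      _pid, status = Process.wait2 # block until some child exits
      statuses << status
      running -= 1
    end
    fork { job.call }
    running += 1
  end

  # Reap the children that are still running.
  running.times do
    _pid, status = Process.wait2
    statuses << status
  end
  statuses
end
```

For example, run_in_forks(jobs, max_children: 2) with five callable jobs would start at most two children at a time. Inside Rails, a forked child also inherits the parent's database connection, which usually has to be re-established in the child.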

Using god only to kill

I serve my software using Passenger. It spawns many Ruby processes.
Sometimes one of these processes becomes bloated and I want it to die.
I was hoping to use god for that. My idea was to monitor all these Ruby processes, and if one consumes more than 500MB of memory for 3 cycles, god should try to kill it gracefully. If it is still alive after 5 minutes, god should kill it forcefully.
It seems to me that god always tries to start the service again, so it forces us to provide a start command. Is it possible to use god only to kill badly behaved processes and let the Passenger spawner bring them back to life when necessary?
The answer to your question lies in the question itself. You can kill Ruby processes using the god gem, which is a Ruby process monitoring framework by the GitHub guys.
Basically, here is how it works:
Configure god to monitor a process. It can be anything: Apache, Passenger, Mongrel, or just a simple file doing a long-running task.
Set conditions in god's configuration file; based on these, god will execute some predefined code.
Here is a simple example (taken from the docs). Consider this file a long-running process that runs indefinitely and that we want to monitor for memory usage; let's call it simple.rb:
loop do
  puts 'Hello'
  sleep 1
end
Now we install the god gem, configure it to run as superuser so it can kill and spawn processes, and then create a configuration file. Example (also taken from the docs):
God.watch do |w|
  w.name = "simple"
  w.start = "ruby /full/path/to/simple.rb"
  w.keepalive(:memory_max => 500.megabytes)
end
Here, as you may have guessed, if the process's memory usage goes above 500 megabytes, god will restart it. Here are a few resources that might help if you are getting started with process management using the god gem:
Example gist - Passenger worker monitor that kills workers which use too much RAM (doesn't use god; Passenger spawns a new worker instead)
Project Homepage
Github Page
An in-depth tutorial on using god with Rails & Passenger
Please remember that ALL configuration for god is actually legal Ruby code, so you can get creative and do all sorts of things.
Lastly, if you frequently find yourself running long-running processes, I advise you to try JRuby, which works much better with long-running processes thanks to the JVM and is a lot faster than MRI.
I use the same setup on many of my projects and had the same memory leak issues. After messing around with monitoring, we decided to use Passenger's own features to tackle it. Specifically, it offers the setting (e.g.) PassengerMaxRequests 300, which shuts down any instance once it has served that number of requests.
If you use it, make sure that PassengerMinInstances is set to 0, because it takes precedence over the setting for max requests.
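For the kill-only behaviour asked about in the question, the "graceful first, forceful after a timeout" part can also be sketched in plain Ruby without god. This is an assumption-laden illustration rather than god's or Passenger's API: the helper names terminate and alive? are hypothetical, and the memory measurement that would decide which pid to kill is left out.

```ruby
# Hypothetical helper: true while a process with this pid exists.
# An exited child that its parent has not yet reaped still counts.
def alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
end

# Hypothetical helper: ask a process to exit with TERM, then fall back
# to KILL if it is still alive once grace_seconds have passed.
def terminate(pid, grace_seconds: 300, poll_interval: 1)
  Process.kill("TERM", pid)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace_seconds
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    return :terminated unless alive?(pid)
    sleep poll_interval
  end
  Process.kill("KILL", pid)
  :killed
rescue Errno::ESRCH
  :terminated # the process was already gone
end
```

Because nothing here restarts the process, bringing a replacement up is left entirely to the Passenger spawner.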

How do you ensure your Rails server running

What is the common approach to making sure that a Rails server is automatically restarted after a serious crash or a process kill? How do you deal with hanging processes? I have nginx and thin running on my production server. Would you suggest putting something in between them, or using another server?
Firstly:
You should identify the cause of a process hang or kill. These are not normal behaviours and indicate a fault somewhere.
Look for:
Insufficient memory or high load before a crash - indicates a configuration problem.
Versions of nginx that are too new.
If you're virtualising, this can cause a number of subtle problems with linux kernels that may cause segfaults. If you're using EC2, use Amazon Linux for your best chance. Ubuntu server is too bleeding edge for this purpose.
In order to do the restarts, I suggest you use monit as this is quick, easy and reliable - it's the normal way to do this.
Lastly, I suggest you set up external monitoring as well, using something like Pingdom, as even monit won't catch every type of fault, such as hardware failures.
If you only want to monitor an application, I always use Nagios with Centreon. You can set up email alerts for when your Rails server is down. You have to set up NRPE on every machine you want to monitor.
When an error is detected, you can run a bash script to kill hanging processes and restart the server automatically. Personally, I never do that, because a crash means something went wrong, so I restart manually in order to check everything.
Take a look here: http://www.centreon.com/

daemon process killed automatically

Recently I ran the command below to start a daemon process that runs once every three days:
RAILS_ENV=production lib/daemons/mailer_ctl start
It was working, but when I came back after a week I found that the process had been killed.
Is this normal or not?
Thanks in advance.
Nope. As far as I understand it, this daemon is supposed to run until you kill it. The idea is for it to work at regular intervals, right? So the daemon is supposed to wake up, do its work, then go back to sleep until needed. So if it was killed, that's not normal.
The question is why was it killed and what can you do about it. The first question is one of the toughest ones to answer when debugging detached processes. Unless your daemon leaves some clues in the logs, you may have trouble finding out when and why it terminated. If you look through your logs (and if you're lucky) there may be some clues -- I'd start right around the time when you suspect it last ran and look at your Rails production.log, any log file the daemon may create but also at the system logs.
Let's assume for a moment that you never can figure out what happened to this daemon. What to do about it becomes an interesting question. The first step is: Log as much as you can without making the logs too bulky or impacting performance. At a minimum log wakeup, processing, and completion events, as well as trapping signals and logging them. Best to log to somewhere other than the Rails production.log. You may also want to run the daemon at a shorter interval than 3 days until you are certain it is stable.
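The logging advice above could look roughly like the sketch below. It is only an illustration under assumptions: the method name run_daemon, its parameters, and the log messages are invented here and do not come from the daemon in the question. The signal handlers only record the signal name, since Ruby's Logger takes a lock and that is not allowed inside a trap handler.

```ruby
require "logger"

# Hypothetical daemon loop: logs wakeup, completion and failure events,
# and logs the signal that asked it to stop.
def run_daemon(logger, interval:, max_runs: nil)
  received = nil
  # Handlers only record the signal; the logging happens in the loop.
  %w[TERM INT].each { |sig| Signal.trap(sig) { received ||= sig } }

  runs = 0
  until received || (max_runs && runs >= max_runs)
    logger.info("wakeup")
    begin
      yield
      logger.info("completed")
    rescue StandardError => e
      logger.error("failed: #{e.class}: #{e.message}")
    end
    runs += 1

    # Sleep in short slices so a recorded signal is noticed promptly.
    deadline = Time.now + interval
    sleep 0.5 while received.nil? && Time.now < deadline
  end

  logger.warn("stopping after SIG#{received}") if received
  runs
end
```

A call such as run_daemon(Logger.new("log/mailer_daemon.log"), interval: 3 * 24 * 3600) { deliver_mail } would keep these events out of production.log; both the log path and deliver_mail are placeholders. A SIGKILL cannot be trapped, so a process killed that way leaves no "stopping" line, which is itself a clue.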
Look into using a process monitoring tool like monit (http://mmonit.com/monit/) or god (http://god.rubyforge.org/). These tools can "watch" the status of daemons and if they are not running can automatically start them. You still need to figure out why they are being killed, but at least you have some safety net.

Rails keeps being rebooted in production Passenger

I'm running an application that kicks off a Rufus Scheduler process in an initializer. The application is running with Passenger in production and I've noticed a couple weird behaviors:
First, in order to restart the server and make sure the initializer gets run, you have to both touch tmp/restart.txt and load the app in a browser. At that point, the initializer fires. The horrible thing is that if you only do the touch, the processes scheduled by Rufus get reset and aren't rescheduled until you load the app in a browser.
This alone I can deal with. But this leads to the second problem: I'll notice that the scheduled process hasn't run, so I load a page and suddenly the log file is telling me that it's running the initializers as if I'd rebooted. So, at some point, Passenger is randomly rebooting as if I'd touched tmp/restart.txt and wiping out my scheduled processes.
I have an incredibly poor understanding of Passenger and Rails's integration, so I don't know whether this occasional rebooting is aberrant or all part of the architecture. Can anyone offer any wisdom on this situation?
What you describe is the way Passenger works. It spawns new instances of the application when traffic warrants them, and shuts them down after periods of inactivity to free resources.
You should read the Passenger documentation, particularly the Resource Control and Optimization section. There are settings which can prevent the application from being shut down by Passenger, if that is what you want.
Using the PassengerPoolIdleTime setting, you could keep at least one process running, but you'll almost certainly want Passenger to start up other instances of the app as necessary. This thread on the Rufus Scheduler Google Group mentions using lock files to prevent more than one process from starting the scheduler, which may be useful to you.
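The lock file idea can be sketched with File#flock from the Ruby standard library. This is a guess at the general shape, not the code from the linked thread: the helper name acquire_scheduler_lock and the lock path are assumptions.

```ruby
# Hypothetical helper: returns the open lock file if this process won
# the lock, or nil if another process already holds it.
def acquire_scheduler_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file # keep this object referenced; closing it releases the lock
  else
    file.close
    nil
  end
end

# In an initializer, assuming a lock path under tmp/:
#   if (SCHEDULER_LOCK = acquire_scheduler_lock("tmp/scheduler.lock"))
#     # start the Rufus scheduler here
#   end
```

The lock is released automatically when the holding process exits, so the next instance that Passenger spawns and that runs the initializer can take over the scheduler.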

Resources