I think I'm having a problem where Engine Yard is adding a timeout to some of my Delayed Job workers (it seems to be 10 minutes). I have a copy process that can run for more than 10 minutes, and every time it reaches that 10-minute threshold the job is killed. Is there any way to configure the Engine Yard timeout for worker instances? I've been looking, and all I see are timeouts for nginx/Apache.
There isn't a timeout set for the Delayed Job workers, so this is more likely a memory usage issue. Monit tracks the memory consumed by the workers and will restart those that reach a set threshold. Monit's actions are logged in /var/log/syslog, so you can check that file to confirm whether Monit is terminating the workers. The memory threshold is set in the /etc/monit.d/delayed_job.monitrc file(s) and can be increased to fit the workers' requirements. After changing the configuration, reload Monit with sudo monit reload.
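As a rough way to do the syslog check described above, here is a minimal Ruby sketch. The helper name monit_delayed_job_lines is made up for illustration, and it uses plain substring matching because the exact wording of Monit's log entries is not given here; treat the filter terms as assumptions and adjust them to what your log actually contains.

```ruby
# Hypothetical helper: keep only the log lines that mention both Monit and a
# Delayed Job worker. Substring matching is an assumption, not Monit's
# documented log format.
def monit_delayed_job_lines(lines)
  lines.select do |line|
    text = line.scrub.downcase # scrub guards against invalid bytes in the log
    text.include?("monit") && text.include?("delayed_job")
  end
end
```

On the instance it could be called as puts monit_delayed_job_lines(File.foreach("/var/log/syslog")), which may need elevated permissions to read the file.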
If you submit a ticket at https://support.cloud.engineyard.com the support staff will be more than happy to help you further diagnose this issue.
Related
I guess I need a sanity check here: if I want to prevent any Sidekiq jobs from ending prematurely, should Heroku Redis handle this for me?
When I want to push new changes to a production site, I put the application in maintenance mode: heroku maintenance:on. When I do this and run heroku ps, I can see that both my web process and my worker (i.e. Sidekiq) are still up (which makes sense, because maintenance mode only prevents users from accessing the site).
If I shut down the worker dyno with a command like heroku ps:stop worker after the site is in maintenance mode, will this safely stop the Sidekiq workers before the dyno goes down? Also, from Sidekiq's documentation:
https://github.com/mperham/sidekiq/wiki/Deployment#heroku
It mentions a -t N switch where N is a number in seconds but that Heroku has a hard limit of allowing a process 30 seconds to shut down on its own. Am I correct that if I stop the worker process with the heroku command, it will give any currently running jobs N seconds to finish before giving it a SIGTERM signal?
If not, what additional steps do I need to take to make sure Sidekiq has safely shut down?
Sounds like you are fine. Heroku sends SIGTERM when you call ps:stop. Sending SIGTERM tells Sidekiq to shut down within N seconds. Your worker dyno should be safely down within 30 seconds.
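To make the SIGTERM behaviour concrete, here is a minimal Ruby sketch of a worker loop that stops picking up new jobs once it receives SIGTERM but lets the job in progress finish. This is not Sidekiq's actual implementation; run_until_term and its parameters are hypothetical names, and the process sends the signal to itself only to simulate what ps:stop would trigger.

```ruby
# Hypothetical worker loop: after SIGTERM, finish the current job and then
# stop taking new ones.
def run_until_term(jobs, signal_during: nil)
  shutting_down = false
  previous = Signal.trap("TERM") { shutting_down = true }
  completed = []

  jobs.each do |job|
    break if shutting_down # no new work once shutdown was requested
    # Simulate the platform sending SIGTERM while this job is in progress.
    Process.kill("TERM", Process.pid) if job == signal_during
    sleep 0.1              # stand-in for the real work
    completed << job       # the in-flight job still completes
  end

  completed
ensure
  Signal.trap("TERM", previous) if previous # restore the earlier handler
end

puts run_until_term([1, 2, 3, 4, 5], signal_during: 2).inspect
```

A real worker would also need to make sure the in-flight job can finish, or be safely retried, within the 30 seconds Heroku allows.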
When applying the worker lifetime option with restart, it looks like the restart still goes ahead even if the worker is running a job.
I applied the lifetime restart option with a 60-second interval, using 1 worker, and ran a job that simply sleeps for twice that long. The restart still appears to take place even while the worker is running the job.
For a graceful restart, I expected the worker to wait for a long-running task or job to finish and then restart itself once idle. That way, even a long-running task isn't interrupted by the auto-restart option.
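The behaviour expected here can be sketched in Ruby as follows. This is a generic illustration, not the library's code: GracefulWorker, its lifetime: and clock: parameters, and restart! are all hypothetical names. The lifetime is checked only between jobs, so a job that outlives it is never interrupted.

```ruby
# Hypothetical worker that restarts only while idle.
class GracefulWorker
  attr_reader :restarts, :finished

  # clock is any callable returning the current time in seconds.
  def initialize(lifetime:, clock:)
    @lifetime = lifetime
    @clock = clock
    @started_at = clock.call
    @restarts = 0
    @finished = 0
  end

  def run(jobs)
    jobs.each do |job|
      restart! if lifetime_expired? # checked only between jobs
      job.call                      # never interrupted mid-job
      @finished += 1
    end
  end

  private

  def lifetime_expired?
    @clock.call - @started_at >= @lifetime
  end

  def restart!
    @restarts += 1                  # stand-in for a real process restart
    @started_at = @clock.call
  end
end
```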
I ran htop on my production server to see what was eating my RAM. A lot of sidekiq processes are running. Is this normal?
Press Shift-H. htop shows individual threads as processes by default. There is only one actual sidekiq process.
You seem to have configured it to have 25 workers.
By default, one sidekiq process creates 25 threads.
If that's crushing your machine with I/O, you can adjust it down:
sidekiq -c 10
https://github.com/mperham/sidekiq/wiki/Advanced-Options
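To illustrate why htop can show many entries for a single process, here is a small Ruby sketch that starts 25 threads (the default thread count mentioned above) and collects the process ID each one sees. The helper name thread_pids is made up for illustration.

```ruby
# Start `count` threads and return the process ID observed inside each one.
def thread_pids(count)
  Array.new(count) { Thread.new { Process.pid } }.map(&:value)
end

pids = thread_pids(25)
puts "threads: #{pids.size}, distinct process IDs: #{pids.uniq.size}"
```

Every thread reports the same process ID, because threads live inside one process and share its memory.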
If you are not using JRuby then it's likely these are all separate processes that consume memory.
I'm under the impression that free dynos will spin down after a while.
What happens to a script that's currently running alongside my main Ruby server and fires off a PhantomJS scraper every now and again?
Do I need a dedicated worker process for this or will Heroku Scheduler do just fine alongside a paid dyno?
I've no issues paying for it, the development always takes a hot second and their workers are a little pricey.
Thanks in advance.
If you want to periodically run a script, Heroku Scheduler is really the ideal way to do this. It'll use one-off dynos, which DO count towards your free dyno allocation each month, but only run during the duration of the task, and stop afterwards.
This is much cheaper than running a dedicated worker dyno that is up 24x7, since a one-off dyno (powered by Heroku Scheduler) only runs for a few minutes per day.
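To put rough numbers on that comparison, here is a small Ruby sketch. The 30-day month and the 10 minutes of scheduled work per day are assumptions chosen for illustration, and monthly_dyno_hours is a made-up helper; no actual Heroku prices are used.

```ruby
# Dyno-hours consumed in a month for a given number of running minutes per day.
# The 30-day month is an assumption.
def monthly_dyno_hours(minutes_per_day, days: 30)
  minutes_per_day * days / 60.0
end

always_on = monthly_dyno_hours(24 * 60) # dedicated worker, up all day
scheduled = monthly_dyno_hours(10)      # assumed 10 minutes of work per day

puts "dedicated worker: #{always_on} dyno-hours"
puts "scheduled task:   #{scheduled} dyno-hours"
```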
A couple of days ago I noticed a strange thing: from time to time the server stops processing requests for a while. In the top output it looks like this:
ten Unicorn workers process requests;
then, for some reason, they stop doing anything: all ten workers have 'sleeping' status;
they sleep for ten to fifteen seconds;
and then suddenly all ten workers start processing requests at the same time (lots of them had been queued for 10s);
I have the following setup:
nginx, unicorn 4.6.2, postgres, redis for sessions and cache, MRI ruby 2.0.0p353.
My first thought was to blame Redis (because if Redis doesn't return sessions, all processes will wait for it), but that doesn't seem to be the case: while the Unicorn workers are frozen, Redis keeps serving other processes that do background jobs.
I don't understand the reason for this strange behaviour.
If someone has thoughts on the matter, I would gladly check them. If you need additional information, just tell me what to do and I'll try to provide it.
UPDATE:
Unicorn config
strace on unicorn worker
strace on unicorn master
strace on nginx
It turned out (with the help of strace on the worker processes) that the workers were trying to write logs to disk. The disk was heavily loaded, so the processes were blocked.
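One quick way to check whether log writes are the bottleneck is to time a synchronous write to the same disk. The Ruby sketch below is only a rough probe under that assumption; timed_sync_write is a hypothetical helper, and the path you pass should be a scratch file on the volume that holds the logs.

```ruby
# Append a message to the file at `path`, force it to disk, and return the
# elapsed time in seconds. A consistently large value suggests a loaded disk.
def timed_sync_write(path, message)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  File.open(path, "a") do |file|
    file.write(message)
    file.fsync # wait until the data has actually reached the disk
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end
```

Running it a few times during a stall and comparing the timings against a quiet period gives a rough idea of how long a worker could be blocked on a single log line.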