Heroku scheduler rake tasks side by side - ruby-on-rails

I have about six Rake tasks I want to run around 4 AM every morning. The issue is that they won't run at the same time.
I don't have a worker dyno switched on, as I thought it was just wasting money. I'm not 100% sure why the worker dyno actually exists.
How do I make the Rake tasks all run at the same time? Would switching on the worker dyno make this work?

Having them all run concurrently is tough with just Heroku Scheduler. One dyno and Rake tasks won't do it; you'll need a threaded background-job processor of some kind. I have used Sidekiq the most and like it best.
There are a couple of moving parts to this, but you'll basically need a worker dyno running Sidekiq with its concurrency set to 6, and then a cron-style scheduler such as whenever or clockwork to enqueue the jobs.
I think this is the best way to handle your problem if you truly need them to run at very close to the same time. Exactly the same time isn't going to happen.
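For example, here is a minimal clock.rb sketch assuming Sidekiq plus clockwork; the worker class and the six task names are hypothetical placeholders, not anything from the question:

require 'clockwork'
require 'sidekiq'

class NightlyTaskWorker
  include Sidekiq::Worker
  # Each job shells out to one Rake task; with Sidekiq's concurrency set to 6,
  # six of these jobs run in parallel on the worker dyno.
  def perform(task_name)
    system('bundle', 'exec', 'rake', task_name) or raise "#{task_name} failed"
  end
end

module Clockwork
  every(1.day, 'nightly.tasks', at: '04:00') do
    # Enqueue all six jobs at once; Sidekiq picks them up concurrently.
    %w[nightly:one nightly:two nightly:three nightly:four nightly:five nightly:six].each do |t|
      NightlyTaskWorker.perform_async(t)
    end
  end
end

On Heroku that means two extra process types in the Procfile, roughly clock: bundle exec clockwork clock.rb and worker: bundle exec sidekiq -c 6 (the -c flag sets Sidekiq's concurrency).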

Related

How to schedule PhantomJS scrapes on free Heroku dynos?

I'm under the impression that free dynos will spin down after a while.
What happens to a script that runs concurrently with my main Ruby server and fires off the PhantomJS scraper every now and again?
Do I need a dedicated worker process for this or will Heroku Scheduler do just fine alongside a paid dyno?
I've no issue paying for it; development always takes a hot second, and their workers are a little pricey.
Thanks in advance.
If you want to run a script periodically, Heroku Scheduler is really the ideal way to do it. It uses one-off dynos, which DO count towards your free dyno allocation each month, but which only run for the duration of the task and stop afterwards.
That makes it much cheaper than a dedicated worker dyno that is up 24x7: a one-off dyno powered by Heroku Scheduler may only run for a few minutes per day.
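The command you give Heroku Scheduler can simply be a Rake task. A hypothetical sketch (the task name and script path are assumptions) that you'd schedule as rake scrape:run:

namespace :scrape do
  desc 'Run the PhantomJS scraper once; the one-off dyno exits when this returns'
  task run: :environment do
    # Shell out to PhantomJS; billing stops as soon as the task
    # finishes and the dyno shuts down.
    system('phantomjs', 'bin/scraper.js') or abort('scraper failed')
  end
end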

Sidekiq - Enqueued Job is running from old code

I have about 30 Sidekiq jobs scheduled in the future (let's say one per day for the next 30 days).
I use Capistrano for deployment, so I have 5 release directories at any time. Let's say:
/var/www/release1/ (recent)
/var/www/release2/
/var/www/release3/
/var/www/release4/
/var/www/release5/
Let's say that after a few days I make a new release. Now the previously scheduled jobs still run from the old code. Is this expected? How can we fix this so that a job uses the latest release directory when it starts running, rather than the one from when it was scheduled?
I'd just like to contribute an alternate answer for anyone who might get into this situation for another reason.
In my case there was a zombie Sidekiq process running: even after I stopped Sidekiq manually and restarted it, another Sidekiq process was still hanging around, running the old code. So it's a good idea to run htop, or ps aux | grep sidek, and look for zombie processes.
This could be because the Sidekiq process didn't restart after a successful deployment.
Make sure your deployment process restarts Sidekiq, and make sure the restart actually works; otherwise the Sidekiq processes are still holding on to old code.
https://github.com/mperham/sidekiq/wiki/Deployment
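As for the original question: the usual fix, per that wiki page, is to have the deploy restart Sidekiq so workers boot from the new release. A minimal Capistrano 3 sketch, assuming Sidekiq runs under systemd (the service name and role name are assumptions):

namespace :sidekiq do
  desc 'Restart Sidekiq so workers load code from the new current/ symlink'
  task :restart do
    on roles(:worker) do
      # Needs passwordless sudo for this unit; afterwards, verify with
      # ps aux | grep sidekiq that no process from the old release survived.
      execute :sudo, :systemctl, :restart, :sidekiq
    end
  end
end
after 'deploy:published', 'sidekiq:restart'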

Keeping rake jobs:work running

I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
Currently I have an issue where the rake jobs:work task, started manually with 'nohup rake jobs:work &', randomly exits.
While God seems to be a solution to some people, the extra memory overhead is rather annoying and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server, with some horrid hacks to give the unprivileged account the site deploys as the ability to restart it?
I'd suggest using foreman. It lets you start any number of jobs in development with foreman start, and then export your configuration (number of processes per type, limits, etc.) as upstart scripts that Ubuntu's upstart can manage (why invoke God when the operating system already gives you this for free?).
The configuration file, Procfile, is also exactly the same file Heroku uses for process configuration, so with just one file you get three process management systems covered.
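A minimal sketch of that workflow: put the worker in the Procfile, e.g.

worker: bundle exec rake jobs:work

then run foreman start to launch it locally, and sudo foreman export upstart /etc/init -a myapp -u deploy to generate the upstart scripts so the OS supervises and restarts it (the app name myapp and user deploy are assumptions).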

Running delayed_job worker on Heroku?

So right now I have an implementation of delayed_job that works perfectly in my local development environment. To start the worker on my machine, I just run rake jobs:work, and it works.
To get delayed_job to work on Heroku, I've been using pretty much the same command: heroku run rake jobs:work. This works without my having to pay Heroku anything for a worker, but I have to keep my command-prompt window open or the delayed_job worker stops when I close it. Is there a command to keep this delayed_job worker running permanently, even after I close the command window? Or is there a better way to go about this?
I recommend the workless gem to run delayed jobs on heroku. I use this now - it works perfectly for me, zero hassle and at zero cost.
I have also used hirefireapp, which gives a much finer degree of control over scaling workers. It costs money, but less than a single Heroku worker over a month. I don't use it now, but I have used it, and it worked very well.
Add
worker: rake jobs:work
to your Procfile.
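For context, a complete Procfile for a Rails app might look like this (the web line is an assumption about your app):

web: bundle exec rails server -p $PORT
worker: bundle exec rake jobs:work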
EDIT:
Even if you run it from your console, you 'buy' a worker dyno, but Heroku bills per second. So you don't pay: you get 750 free dyno-hours, and the longest month has 744 hours, which leaves 6 free hours for your extra dynos, scheduler tasks, and so on.
I haven't tried it personally yet, but you might find nohup useful. It lets your process keep running even after you close your terminal window. Link: http://linux.101hacks.com/unix/nohup-command/
Using the heroku console to get workers onto the jobs will only create a temporary dyno for the job. To keep the jobs running without the CLI, you need to put the command into the Procfile as #Lucaksz suggested.
After deployment, you also need to scale the dyno formation, as Heroku needs to know how many dynos to put onto each process type, like this:
heroku ps:scale worker=1
More details can be read here https://devcenter.heroku.com/articles/scaling

Pragmatic ways to monitor Resque queues in Rails

I am looking to automate the starting/restarting of queues with Resque in my Ruby on Rails application. (running on JRuby)
I want to ensure the following criteria are met:
Workers are started after I deploy with capistrano
Workers are restarted if they die for whatever reason
Workers eating too much memory are stopped/restarted and can fire me an email alert
Are there tools that currently provide this functionality, or at least a subset of it? If there isn't anything that restarts the queue/worker, I would at minimum like to be notified so I can do it manually.
The easiest way to do it would be to use a program such as God or Monit to get #2 and #3. For #1, you can just set up your Capistrano script to send kill -INT to all the Resque workers; the monitoring program will then start them up again.
The advantage of using kill -INT, rather than manually stopping and starting the jobs in the Capistrano script, is that your deploy won't have to wait for every worker to finish processing its current job before starting them back up. It also means that if you have a long-running job, whatever workers were free will be running on the new code as quickly as possible.
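A sketch of that Capistrano hook, in the Capistrano 2 style of the era; the role name is an assumption, and the pattern assumes the worker proclines contain 'resque':

namespace :resque do
  desc 'Interrupt Resque workers; God/Monit will restart them on the new code'
  task :restart_workers, roles: :worker do
    # pkill -f matches the full command line and never matches itself;
    # || true keeps the deploy going if no worker is currently running.
    run "pkill -INT -f resque || true"
  end
end
after 'deploy:restart', 'resque:restart_workers'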
I'm not especially familiar with it; however, I believe the god gem is frequently used for process management.
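God config files are plain Ruby, which makes the memory rule in the third requirement easy to express. A minimal sketch (the name, path, and memory limit are assumptions):

God.watch do |w|
  w.name  = 'resque-worker'
  w.dir   = '/var/www/current'
  w.start = 'bundle exec rake resque:work QUEUE=*'
  # Restart the worker if it dies, and also if it grows past 250 MB.
  w.keepalive(memory_max: 250.megabytes)
end

God also has a contacts/notification system that can email you when a watch transitions, which would cover the alerting half of that requirement.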
