Make rufus-scheduler work on Heroku - ruby-on-rails

I would like to use rufus-scheduler to send an email daily.
I strictly followed the instructions given on GitHub here (the snippet, using the Thin server, etc.), but nothing happened (no mail sent) and I couldn't figure out the reason from the Heroku logs.
My code:
# config/initializers/scheduler.rb
require 'rufus-scheduler'
s = Rufus::Scheduler.singleton
# Here goes my mailing code (already tested and works well)
s.every '1d' do
  Rails.logger.info "hello, it's #{Time.now}"
  Rails.logger.flush
end
Are there other points, not mentioned on GitHub, that are needed to make rufus-scheduler work? Many thanks.

You can do this, but you have to understand how free dynos work. Each free account is allocated hours like this:
Accounts are given a base of 550 hours each month in which your Free dynos can run. In addition to these base hours, accounts which verify with a credit card will receive an additional 450 hours of Free dyno quota.
Because rufus-scheduler runs as a thread in your ruby app process (this is why it starts running when you run rails console), as long as you have your server (I use puma) running, rufus scheduler will run just fine.
The downside is that if you run more than one process in your server, say Puma with 3-4 workers, you will have 3-4 schedulers running at the same time, so your scheduled events execute in triplicate or quadruplicate. Keep that in mind as well.
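One way to keep several worker processes on the same dyno from each starting a scheduler is an exclusive file lock taken at boot. The following is only a minimal sketch in plain Ruby: `acquire_scheduler_lock`, `LOCK_PATH` and `SCHEDULER_LOCK` are hypothetical names, and the lock only guards processes that share a filesystem, so it does not help across separate dynos.

```ruby
require 'tmpdir'

# Hypothetical helper: try to take an exclusive, non-blocking lock on `path`.
# Returns the open File (keep a reference to it to hold the lock) or nil if
# the lock is already held through another open file handle.
def acquire_scheduler_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  return file if file.flock(File::LOCK_EX | File::LOCK_NB)

  file.close
  nil
end

# Assumed lock location for this demo; a real app would pick a fixed path.
LOCK_PATH = File.join(Dir.tmpdir, "scheduler-demo-#{Process.pid}.lock")
SCHEDULER_LOCK = acquire_scheduler_lock(LOCK_PATH)

if SCHEDULER_LOCK
  puts 'lock acquired: this process would start the scheduler'
else
  puts 'lock busy: another process already runs the scheduler'
end
```

Only the process that gets the lock would go on to define its `s.every` jobs; the others skip scheduling entirely.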
So the steps are simple
- make sure you have enough hours to run your dyno continuously all month
- use a service like Pingdom to ping your app every couple of minutes, which keeps the dyno active so that Heroku doesn't spin it down after 30 minutes of inactivity (it does that to free dynos)
- that should be all you need to do
Just remember that to run a dyno for a whole month you need up to 744 hours (24 hours × 31 days), which your allocation covers only once you add a credit card. If for some reason you run out of hours (say you run two different apps on the account and use the method I describe above), then this could happen to you:
A second notification will be sent when you reach 100% of your account quota, at which point your application’s dynos will be put to sleep for the remainder of that month. As a result, any apps using free dynos will not be accessible for the remainder of the month.
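The hour arithmetic behind this can be checked quickly. This is a small sketch using only the quota figures quoted above (550 base hours, 450 extra after card verification); the method and constant names are made up for illustration.

```ruby
HOURS_PER_DAY = 24
BASE_FREE_HOURS = 550       # base monthly quota quoted above
VERIFIED_BONUS_HOURS = 450  # extra quota after adding a credit card

# Hours a single dyno consumes when it runs non-stop for a month.
def hours_needed(days_in_month)
  days_in_month * HOURS_PER_DAY
end

# True when the quota covers one always-on dyno for the whole month.
def covers_month?(quota_hours, days_in_month)
  quota_hours >= hours_needed(days_in_month)
end

puts hours_needed(31)                                          # 744
puts covers_month?(BASE_FREE_HOURS, 31)                        # false
puts covers_month?(BASE_FREE_HOURS + VERIFIED_BONUS_HOURS, 31) # true
```

Two always-on apps would need 1488 hours in a 31-day month, which is more than the 1000 hours available even with a verified card.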
Seems like a lot of trouble to go to when you can just use the heroku scheduler to schedule rake tasks like everyone else does.

(Disclaimer, I'm not a Heroku expert, but I can google my way around and I can read documentation).
So, it's not your first Heroku - rufus-scheduler question... (Rails_using Rufus in order to schedule sending mails daily)
You say "in order to send a mail daily", so why don't you use the Heroku scheduler addon? https://devcenter.heroku.com/articles/scheduler It can schedule daily and even hourly (ironically, it may miss a schedule, so they recommend the below "custom clock process").
Do you realize that your dyno might be asleep when the time of the schedule comes? Heroku puts the dynos to "sleep" after a certain period of inactivity.
Heroku suggests to use a "custom clock process": https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes#custom-clock-processes
You have to get to know your target platform, Heroku, and adapt your system to it, with or without rufus-scheduler.
As a side note, your previous post mentions Passenger, which is hard to tune so that it doesn't kill rufus-scheduler's thread. That wouldn't play a big role on Heroku, though, where your dyno isn't supposed to live forever: rufus-scheduler can't outlive the dyno of its webapp, hence the "custom clock process" recommendation.

Related

Rail app on Heroku: How to implement a "scheduled job" at regular intervals

I am not sure if this question is of the correct format for SO.
I have a Rails app deployed on Heroku. In development I am using the "crono" gem to send a simple email every week to remind users of various things. I can't see how to use this in production on Heroku, and a Google search turns up almost nothing. Therefore my question is: what is the best way to implement a simple weekly job in a Rails app that is deployed on Heroku? Heroku has "Scheduler", but this is a "best effort" add-on that can't claim to be reliable (according to their documentation).
Thanks
There are two ways to achieve what you want:
1. Use the Heroku scheduler (I would)
Honestly it's just so simple to set up, I would use this. It's also extremely cheap because you only pay for the dyno while the job is running. After it's run Heroku destroys the dyno (it's a one off dyno)
The best way to implement it is to have your background jobs callable by a rake task, and simply have the scheduler call that.
If you have a time interval Heroku doesn't support, simply handle that in your code. For example if you want to send e-mails once a week, have a timestamp to record when the last email was sent. Run the scheduler once a day and just check to see if it's ok to send another email, if not do nothing.
2. Use some kind of scheduler gem
There's a bunch of them out there. For example, rufus.
In this case, you'd write your rufus jobs. You would then need a worker dyno that is always running for rufus, which is much more expensive.
You'd tell the dyno to run rufus, and keep running it, by specifying the command rufus needs in your Procfile. Something like:
scheduler: rake rufus:scheduler # Add rake task for rufus scheduler process
(credit for the above snippet How to avoid Rufus scheduler being called as many times as there are no.of dynos?)
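The timestamp check suggested under option 1 (run the scheduler daily, but only send when a week has passed) could look roughly like this. It is a sketch in plain Ruby: `weekly_email_due?` is a hypothetical helper, and where the last-sent timestamp is stored (a database column, for instance) is left open.

```ruby
WEEK_IN_SECONDS = 7 * 24 * 60 * 60

# True when no email was ever sent, or the last one is at least a week old.
def weekly_email_due?(last_sent_at, now = Time.now)
  last_sent_at.nil? || (now - last_sent_at) >= WEEK_IN_SECONDS
end

now = Time.utc(2024, 1, 15, 9, 0, 0)
puts weekly_email_due?(nil, now)                            # true
puts weekly_email_due?(Time.utc(2024, 1, 10, 9, 0, 0), now) # false (5 days ago)
puts weekly_email_due?(Time.utc(2024, 1, 8, 9, 0, 0), now)  # true (7 days ago)
```

The daily rake task would call this first and do nothing when it returns false, then record the new timestamp after a successful send.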

Web Dynos in Heroku

I just wanted to see what the best practice in the following situation would be.
I have set up Scheduler in my Heroku app to run two rake tasks (they perform a screen scrape). These are run once a day. From what I have read, I have 750 free hours per month of dyno processes, but you accrue usage even when the dyno is idle. So do I need to run
heroku ps:scale web=0
so that the dyno doesn't accrue usage when not running, or do I just leave it as it is?
What is the best thing to do here?
Thanks
If you haven't added any more web workers then you should be on the free tier. If you log into your Heroku account and go to the app's dashboard you'll see an estimated monthly cost for resources used, you can double check that it's on $0.
I tested both heroku ps:scale web=0 and heroku ps:scale web=1 on one of my apps. Both leave the cost at $0, and the app is still online even with 0 web workers, so I'm not sure how that works.
You will however pay for the scheduler, for the time it was up to call the rake task. Might be a few dollars per month, or perhaps less than a dollar, depends how long it was up for.

Email notification when 'updated_at' become 2 hours before current time

I'd like to make an email notification if SomeModel has not been updated for 2 hours.
What is the best way to implement it?
After a model has been saved, queue up a background job to run 2 hours from that time to send the email. When a new job is enqueued, remove any still-unrun jobs that are still on the queue.
resque-scheduler provides a pretty simple way of doing this, assuming you have Redis up and running.
Personally I find the solution that @x1a4 proposes to be somewhat overkill. Given the relatively large window of 2 hours, I would just run a job periodically (say, once every 10-15 minutes), then search all Models for updated_at <= 2.hours.ago and send out the emails.
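The periodic check can be sketched without Rails to show the selection logic. Everything here is illustrative: `Record` stands in for the model, and the `notified` flag is an assumption added so that the same record is not emailed again on every run, which the answer itself does not address.

```ruby
# Stand-in for the ActiveRecord model; `notified` is an assumed extra flag.
Record = Struct.new(:id, :updated_at, :notified)

TWO_HOURS = 2 * 60 * 60

# Select records whose last update is at least two hours old and that
# have not been notified yet.
def stale_records(records, now = Time.now)
  cutoff = now - TWO_HOURS
  records.select { |r| r.updated_at <= cutoff && !r.notified }
end

now = Time.utc(2024, 1, 15, 12, 0, 0)
records = [
  Record.new(1, now - 3 * 60 * 60, false), # stale
  Record.new(2, now - 30 * 60, false),     # recently updated
  Record.new(3, now - 4 * 60 * 60, true)   # stale, but already notified
]

stale_records(records, now).each do |record|
  puts "would email about record #{record.id}"
  record.notified = true
end
```

In a Rails app the same filter would be a single query on `updated_at`, run from whatever job fires every 10-15 minutes.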
As for scheduling that job to run every 15 minutes, there are several options. You may use resque-scheduler, if you are using Resque. You may also use the standard system cron, but will incur some fairly substantial overhead starting Rails each time the job runs. I also have written a distributed scheduler gem (i.e. cron that can run on multiple machines, but act like it's only running on one), which uses Redis under the hood.

How can I monitor recurrent rake tasks run by heroku scheduler?

I just got last month's Heroku bill, and the scheduled rake tasks were a relatively heavy burden. We are pretty early in our development process, so we just developed some rake tasks to get the job done, and didn't pay much attention to their optimization.
Now we want to improve their performance and their Heroku processing hours usage. We use New Relic to monitor the webapp performance, but apparently this type of rake task is ignored by default, and it's unclear how to override that.
Has anyone had a similar problem? How can I track the scheduled tasks in close to real time to monitor performance, optimize, and avoid surprise bills?
Whilst you can't really monitor rake tasks that well, there are a few little things you can do. One is the use of logging. Output start and end times of tasks to logs, and you can then see what's been happening duration wise. If you couple this with something like the Papertrail add-on then you can do additional interrogation later on.
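A minimal sketch of the start/end logging idea, using only Ruby's standard `Logger`. The helper name `log_task_duration` and the task name are hypothetical; on Heroku, anything written to standard output ends up in the app's log stream, where an add-on can pick it up.

```ruby
require 'logger'

# Hypothetical helper: run a block, logging when it starts, when it ends,
# and how long it took. Returns the elapsed time in seconds.
# Note: if the block raises, no "finished" line is written.
def log_task_duration(name, logger)
  logger.info("#{name} started")
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  logger.info(format('%s finished in %.3fs', name, elapsed))
  elapsed
end

logger = Logger.new($stdout)
log_task_duration('demo:task', logger) { sleep 0.05 }
```

Wrapping the body of each rake task this way gives one start line and one duration line per run, which is enough to spot the slow tasks later.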
As for running the jobs themselves, there are a couple of ways to run background processes, depending on how they need to run:
If you need to run jobs on a schedule, there are a few options available. Firstly there's the Heroku scheduler, which is pretty good, but doesn't guarantee executions will happen. Normally you would use this to kick off a rake task, which brings up a one-off dyno for the duration of the task. You therefore need to ensure in development that these tasks are as efficient as possible.
Alternatively, if you're looking at jobs that need a little more control, consider using a clock process. Essentially this is a dyno running 24/7 that does nothing but kick off other jobs at preset intervals and times. This would normally be done using the clockwork gem. The downside of this approach is that you need to pay for a clock process all the time.
A third approach, and one that might work, is delayed_job with its run_at option, which allows you to queue a job to be run in the future (and jobs can re-queue themselves). There are a few issues with this: a failure can kill the whole chain, and you need a full-time worker running to process the jobs.
Therefore, in order to minimize your bills, ensure that your rake tasks are as performant and reliable as possible, and then choose the scheduling option that suits you. If you're looking at schedules plus user-created events, delayed_job might be the best option. If you're looking at a few tasks running periodically, then go with the scheduler. If you're looking at running lots of time-critical jobs on a regular basis, go with clockwork.
Either way, you should be able to constrain a fair amount of processing into just one or two processes depending on your approach.
I know this question is almost 10 years old, but there is a new way!
You can now monitor your Heroku Scheduler jobs using One-off Dyno Metrics. This Heroku add-on gathers metrics for all detached one-off dynos running in your Heroku app. It was created to be an extension of Heroku's Application Metrics and works out of the box.
When you are running on Heroku Cedar there is a way to get a free setup for your workers. This is no answer to your monitoring question, but it might be interesting anyway: http://blog.nofail.de/2011/07/heroku-cedar-background-jobs-for-free/
You can force the New Relic agent to start in your rake tasks and report their performance data.
Not the answer to the specific question, but...
One method of reducing overhead is using the Unicorn server to get multiple workers working on one dyno. It depends on your setup, but most people who've taken the time to test it can comfortably get 3-4 worker processes running concurrently. It's a huge boost in clearing queues or tasks. Just be careful not to max out the allocated memory for the dyno.

Can I start and stop delayed_job workers from within my Rails app?

I've got an app that could benefit from delayed_job and some background processing. The thing is, I don't really need/want delayed_job workers running all the time.
The app runs in a shared hosting environment and in multiple locations (for different users). Plus, the app doesn't get a large amount of usage.
Is there a way to start and stop processing jobs (either with the script or rake task) from my app only after certain actions/events?
You could call out to system:
system "cd #{Rails.root} && rake delayed_job:start RAILS_ENV=production"
You could just change delayed_job to check less often too. Instead of the 5 second default, set it to 15 minutes or something.
Yes, you can, but I'm not sure what the benefit will be. You say you don't want workers running all the time - what are your concerns? Memory usage? Database connections?
To keep the impact of delayed_job low on your system, I'd run only one worker, and configure it to sleep most of the time.
Delayed::Worker.sleep_delay = 60 * 5 # in your initializer.rb
A single worker will only wake up and check the db for new jobs every 5 minutes. Running this way keeps you from 'customizing' too much.
But if you really want to start a Delayed::Worker programmatically, look in that class for work_off, and implement your own script/run_jobs_and_exit script. It should probably look much like script/delayed_job does: about 3 lines.
I found this because I was looking for a way to run some background jobs without spending all the money to run them all the time when they weren't needed. Someone made a hack using google app engine to run the background jobs:
http://viatropos.com/blog/how-to-run-background-jobs-on-heroku-for-free/
It's a little outdated though. There is an interesting comment in the thread:
"When I need to send an e-mail, copy a file, etc I basically add it to the queue. At the end of every request it checks if there is anything in the queue. If so then it uses the Heroku API to set the worker to 1. At the end of a worker getting a task done it checks to see if there is anything left in the queue. If not then it sets the workers back to 0. The end result is the background worker will just work for a few seconds here and there. I can do all the background processing that I need and the bill at the end of the month rarely ever reaches 1 hour total worth of work. Even if it does no problem, I'll pay $0.05 for background processing. :)"
If you go to stop a worker, you are given the PID. You can simply kill -9 PID if all else fails.
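The on-demand worker pattern from the quoted comment (scale the worker to 1 when the queue has work, back to 0 when it is empty) can be sketched as follows. This is an in-memory illustration only: `OnDemandWorkerQueue` is a hypothetical class, and the `scale_workers` callable stands in for whatever call to the Heroku API actually changes the worker count, which is not shown here.

```ruby
# In-memory stand-in for the job queue, plus a callable that sets the
# number of worker dynos (assumed to wrap a platform API call in real use).
class OnDemandWorkerQueue
  def initialize(scale_workers)
    @jobs = []
    @scale_workers = scale_workers
  end

  def enqueue(&job)
    @jobs << job
  end

  # Called at the end of a web request: start a worker if work is waiting.
  def after_request
    @scale_workers.call(1) unless @jobs.empty?
  end

  # Called by the worker: drain the queue, then scale back down to zero.
  def work_off
    @jobs.shift.call until @jobs.empty?
    @scale_workers.call(0)
  end
end

scale_calls = []
queue = OnDemandWorkerQueue.new(->(count) { scale_calls << count })
queue.enqueue { puts 'sending email' }
queue.after_request
queue.work_off
p scale_calls # [1, 0]
```

The worker only exists between the two scale calls, which is why the billed time stays so small in the quoted account.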
