Best way to run rake task every 2 min on Heroku - ruby-on-rails

I have a Rails module that processes some Active Record objects, only about 15-20 at a time, and I need to kick it off every two minutes.
I have tried offloading it to Sidekiq (with sidekiq-cron), which works, but the concurrency created many race conditions and duplicate data.
I really just need a simple cron-style rake task for Rails, or maybe Sinatra (I would create a new Sinatra app just to run these tasks).
I either need to force Sidekiq to process in a single thread, or
have a "cron" job run a rake task or even call the module directly:
def self.process_events
  events = StripeEvent.where(processed: false)
  events = StripeServices.arrange_processing_order(events)
  events.each do |event_obj|
    StripeServices.new(event_obj).process_event_obj
  end
end
Thanks for any pointers in the right direction.
Edit:
Sorry, I wasn't very clear. Pushing my module to Sidekiq caused concurrency issues that I wasn't ready for (my code is not thread-safe). Given the restrictions that Heroku places on "crons", what's the best way to run a rake task every 2 minutes?
If Sinatra can do it, I would prefer that, but I can't find a solution to the same problem there.

It's not clear what you are asking. You have already tried option 1; you can try option 2 (create the task and cron it, it's pretty easy) and you'll know better than anyone whether it works better.
Either way, I suspect both methods will have concurrency problems if one run takes more than 2 minutes.
You can add an extra flag to prevent two tasks from processing the same StripeEvent (for example, add a boolean "processing" column and set it to true when a task picks the event up).
Or you can use a lock file to prevent a task from running while another one is still running: create a file with a known name and location when the task starts, delete it when it finishes, and check whether the file exists before starting a new run.
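The lock-file idea could be sketched roughly like this. Note that this is a variation, not exactly what is described above: instead of checking whether the file exists, it takes a non-blocking flock on the file, so a crashed run does not leave a stale lock behind. The helper name with_lock_file and the lock path are made up for illustration. Also keep in mind that a file lock only guards processes on the same machine; separate Heroku dynos do not share a filesystem, so there the "processing" flag in the database is probably the safer route.

```ruby
require "tmpdir"

# Hypothetical lock location; any path writable by the task would do.
LOCK_PATH = File.join(Dir.tmpdir, "process_events.lock")

# Runs the block only if no other process currently holds the lock.
# Returns true when the block ran, false when the run was skipped.
def with_lock_file(path = LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false right away instead of waiting.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    begin
      yield
      true
    ensure
      # Release the lock even if the block raised.
      file.flock(File::LOCK_UN)
    end
  end
end

# Usage inside the rake task (YourModule stands in for the module
# from the question):
#   with_lock_file { YourModule.process_events }
```

If a previous run is still going, the new run simply returns false and exits, which is usually what you want for a job that fires every 2 minutes.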

Related

rails periodic task

I have a Ruby on Rails app in which I'm trying to find a way to run some code every few seconds.
I've found lots of info and ideas using cron, or cron-like implementations, but these are only accurate down to the minute, and/or require external tools. I want to kick the task off every 15 seconds or so, and I want it to be entirely self contained within the application (if the app stops, the tasks stop, and no external setup).
This is being used for background generation of cache data. Every few seconds, the task will assemble some data, and then store it in a cache which gets used by all the client requests. The task is pretty slow, so it needs to run in the background and not block client requests.
I'm fairly new to Ruby, but have a strong Perl background, and the way I'd solve this there would be to create an interval timer and handler which forks, runs the code, and then exits when done.
It might be even nicer to just simulate a client request and have the rails controller fork itself. This way I could kick off the task by hitting the URI for it (though since the task will be running every few seconds, I doubt I'll ever need to, but might have future use). Though it would be trivial to just have the controller call whatever method is being called by the periodic task scheduler (once I have one).
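I'd suggest the whenever gem: https://github.com/javan/whenever
It allows you to specify a schedule like:
every 15.minutes do
  runner "MyClass.do_stuff"
end
You don't write crontab entries by hand; whenever generates them from this schedule. Note that it still relies on cron, so it cannot run anything more often than once a minute.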
Generally speaking, there's no built in way that I know of to create a periodic task within the application. Rails is built on Rack and it expects to receive http requests, do something, and then return. So you just have to manage the periodic task externally yourself.
I think given the frequency that you need to run the task, a decent solution could be just to write yourself a simple rake task that loops forever, and to kick it off at the same time that you start your application, using something like Foreman. Foreman is often used like this to manage starting up/shutting down background workers along with their apps. On production, you may want to use something else to manage the processes, like Monit.
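As a rough sketch of the "loop forever" idea, here is one way the body of such a rake task might look. The helper name run_periodically and its parameters are invented for this example; the iterations argument exists only so the loop can be stopped in a test, and passing nil makes it run until the process is killed.

```ruby
# Calls the block roughly every `interval` seconds.
# `iterations` limits the number of runs; nil means loop forever.
# Returns the number of times the block was called.
def run_periodically(interval:, iterations: nil)
  count = 0
  until iterations && count >= iterations
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    count += 1
    # Sleep only for what is left of the interval, so a slow run
    # does not push every later run back by its own duration.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    remaining = interval - elapsed
    sleep(remaining) if remaining > 0
  end
  count
end

# Usage (MyCache.rebuild is a placeholder for the slow cache-building code):
#   run_periodically(interval: 15) { MyCache.rebuild }
```

Because the loop is single-threaded, a run that takes longer than the interval just delays the next one; runs never overlap.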
You can either write your own method, something like
class MyWorker
  def self.work
    loop do
      # do your work
      sleep 15
    end
  end
end
and run it with rails runner "MyWorker.work"
This runs as a separate process alongside the app.
Or you can use something like Resque, but that's a different approach. It works like this: something adds a task to the queue, and meanwhile a worker fetches whatever job is in the queue and tries to finish it.
So that depends on your own need.
I know it is an old question. But maybe for someone this answer could be helpful. There is a gem called crono.
Crono is a time-based background job scheduler daemon (just like Cron) for Ruby on Rails.
Crono is pure Ruby. It doesn't use Unix Cron and other platform-dependent things. So you can use it on all platforms supported by Ruby. It persists job states to your database using Active Record. You have full control of jobs performing process. It's Ruby, so you can understand and modify it to fit your needs.
The awesome thing with crono is that its code is self-explanatory. To run a task periodically you can just do:
Crono.perform(YourJob).every 2.days
Maybe you can also do:
Crono.perform(YourJob).every 30.seconds
Anyway you really can do a lot of things. Another example could be:
Crono.perform(TestJob).every 1.week, on: :monday, at: "15:30"
I suggest this gem instead of whenever because whenever uses the Unix cron table, which is not always available.
Throwing out a solution just because it looks somewhat elegant and answers the question without any extra gems. In my scenario I wanted to run some code, but only after all my Sidekiq workers were done doing their thing.
First I defined a method to check if any workers were working...
def workers_working?
  workers = Sidekiq::Workers.new.map do |_process_id, _thread_id, work|
    work
  end
  workers.size > 0
end
Then we just call the method with a loop which sleeps between calls.
sleep 5 while workers_working?
Use something like delayed job, and requeue it every so often?
Use thin or other server which uses eventmachine, then just use timers that are part of eventmachine. Example: in config/application.rb
EM.add_periodic_timer(2) do
  do_this_every_2_sec
end

Some questions about using resque

I am using Resque to run a background process. This is how my background process works:
Scans through all the rows in an ActiveRecord model
Checks for a condition
Updates the row if the condition is met
And this needs to go on infinitely.
This is how I am trying to use Resque for my purpose, here's my worker class:
class ThumbnailMaker
  @queue = :thumbnail_queue

  def self.perform
    MyObj.check_thumbnails(root_url)
  end
end
I understand the perform() method keeps a task in a queue, which is run periodically. In my case, I need a task that scans the whole table, so it runs for a longer time. Is it a good solution to my requirements?
On another note, I need the root url for my Rails application, which is easily obtained with the root_url in Rails Controller. But I need it in a class I have created, can you suggest me how I can get it here?
Resque is for queueing tasks to be run in the background; each item in the queue runs once and is then removed. What you want is more like a scheduled task, for example a custom rake task or other script that runs from time to time. There are many scheduling gems available for this kind of thing (whenever is very popular), or you can just use cron. There is a great RailsCasts episode about this very topic.
You might want to try putting your code in a rake task and running it periodically through a cron job. Resque/Redis seems a bit too much for your needs.
You may consider passing the root URL in as a parameter if you are calling your class from your controller. Otherwise, you may want to set it as an ENV setting and configure each of your deployments accordingly.
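Both suggestions can be combined in one small class, sketched below. The class name ThumbnailChecker, the url_for helper, and the ROOT_URL variable name are all assumptions made for this example; nothing here comes from Resque or Rails itself.

```ruby
# Hypothetical worker-side class: the root URL is injected instead of
# being read from a controller helper.
class ThumbnailChecker
  attr_reader :root_url

  # ROOT_URL is an assumed environment variable, set per deployment.
  # ENV.fetch raises KeyError if it is missing and no argument is given.
  def initialize(root_url: ENV.fetch("ROOT_URL"))
    @root_url = root_url.chomp("/")
  end

  # Builds an absolute URL for a path such as "/thumbnails/1.png".
  def url_for(path)
    "#{root_url}/#{path.delete_prefix("/")}"
  end
end

# From a controller, pass the value explicitly:
#   ThumbnailChecker.new(root_url: root_url)
# From a background job, rely on the environment:
#   ThumbnailChecker.new
```

Failing loudly on a missing ROOT_URL is a deliberate choice here: a background job that silently builds URLs from an empty string is harder to notice than one that crashes on start.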

With a Rails stack, how can I create a background process that handles events by spawning threads that are worked in real time?

With a Rails stack, how can I create a background process that handles events by spawning threads that are worked in real time?
The workers on Heroku pick up jobs every 5 seconds. I need real time. Ideally I'd like to get this working on Heroku, but if I need to, I will move away from it.
This has a long list of background workers: Background Job Manager for Rails 3. However, it is not clear whether your question is Heroku-specific or not.
I think you are looking for something like "run_later" which instead of queueing a job actually returns the request and runs a block in a separate process.
Here is a link to the Rails 3+ version, you can follow the fork network to find many other implementations:
https://github.com/Zelnox/run_later
(I don't use Heroku so I don't know if it runs on it)
Heroku runs rake jobs:work, so you can replace that with your own rake task, either running delayed_job with a shorter than 5 second timeout, or just performing your own task. Probably a good idea to keep a sleep statement in there.
The new Cedar stack will run anything you want, so it might be worth checking that out.
With regard to run_later, or spawning from the current request: this does work, but if the background process doesn't complete within the 30 second request timeout then Heroku will kill it.
I think you need delayed_job. Please check out this gem.

Regular delayed jobs

I'm using Delayed Job to manage background work.
However I have some tasks that need to be executed at regular interval. Every hour, every day or every week for example.
For now, when I execute the task, I create a new one to be executed in one day/week/month.
However I don't really like it. If for any reason, the task isn't completely executed, we don't create the next one and we might lose the execution of the task.
How do you manage this kind of thing (with Delayed Job) in your Rails apps to be sure your list of regular tasks stays correct?
If you have access to Cron, I highly recommend Whenever
http://github.com/javan/whenever
You specify what you want to run and at what frequency in dead simple ruby, and whenever supplies rake tasks to convert this into a crontab and to update your system's crontab.
If you don't have access to frequent cron (like I don't, since we're on Heroku), then DJ is the way to go.
You have a couple options.
Do what you're doing. DJ will retry each task a certain number of times, so you have some leniency there
Put the code that creates the next DJ job in an ensure block, to make sure it gets created even after an exception or other bad event
Create another DJ that runs periodically, checks to make sure the appropriate DJs exist, and creates them if they don't. Of course, this is just as error prone as the other options, since the monitor and the actual DJ are both running in the same env, but it's something.
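The ensure-block option can be illustrated with a small self-contained sketch. FakeQueue is an in-memory stand-in invented for this example (Delayed Job would store jobs in the database), and RecurringJob is likewise hypothetical; only the ensure pattern itself is the point.

```ruby
# In-memory stand-in for the job queue (hypothetical).
class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(job, run_at:)
    @jobs << [job, run_at]
  end
end

# A recurring job that schedules its next run even when the work raises.
class RecurringJob
  def initialize(queue, interval, &work)
    @queue = queue
    @interval = interval
    @work = work
  end

  def perform(now = Time.now)
    @work.call
  ensure
    # Runs on success and on failure; the exception still propagates
    # afterwards, so the failure remains visible to the caller.
    @queue.enqueue(self, run_at: now + @interval)
  end
end
```

This covers ordinary exceptions only. If the worker process is killed outright, the ensure block never runs, which is why the monitoring job from the last option can still be worth having.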
Is there any particular reason why you wouldn't use cron for this type of thing?
Or maybe something more Ruby-ish like rufus-scheduler, which is quite easy to use and very reliable.
If you don't need queuing, these tools are the way to go, I think.

Can I start and stop delayed_job workers from within my Rails app?

I've got an app that could benefit from delayed_job and some background processing. The thing is, I don't really need/want delayed_job workers running all the time.
The app runs in a shared hosting environment and in multiple locations (for different users). Plus, the app doesn't get a large amount of usage.
Is there a way to start and stop processing jobs (either with the script or rake task) from my app only after certain actions/events?
You could call out to system:
system "cd #{Rails.root} && rake delayed_job:start RAILS_ENV=production"
You could just change delayed_job to check less often too. Instead of the 5 second default, set it to 15 minutes or something.
Yes, you can, but I'm not sure what the benefit will be. You say you don't want workers running all the time - what are your concerns? Memory usage? Database connections?
To keep the impact of delayed_job low on your system, I'd run only one worker, and configure it to sleep most of the time.
Delayed::Worker.sleep_delay = 60 * 5 # seconds; set this in an initializer
A single worker will only wake up and check the db for new jobs every 5 minutes. Running this way keeps you from 'customizing' too much.
But if you really want to start a Delayed::Worker programmatically, look in that class for work_off, and implement your own script/run_jobs_and_exit script. It should probably look much like script/delayed_job does: about 3 lines.
I found this because I was looking for a way to run some background jobs without paying to keep them running all the time when they weren't needed. Someone made a hack using Google App Engine to run the background jobs:
http://viatropos.com/blog/how-to-run-background-jobs-on-heroku-for-free/
It's a little outdated though. There is an interesting comment in the thread:
"When I need to send an e-mail, copy a file, etc I basically add it to the queue. At the end of every request it checks if there is anything in the queue. If so then it uses the Heroku API to set the worker to 1. At the end of a worker getting a task done it checks to see if there is anything left in the queue. If not then it sets the workers back to 0. The end result is the background worker will just work for a few seconds here and there. I can do all the background processing that I need and the bill at the end of the month rarely ever reaches 1 hour total worth of work. Even if it does no problem, I'll pay $0.05 for background processing. :)"
If you go to stop a worker, you are given the PID. You can simply kill -9 PID if all else fails.
