Debouncing is a common method to postpone a function/job from executing until after certain time has passed.
Use-case: A conversation with active chatting from multiple users, they should not receive an email notification for each message typed. But more than likely after a few minutes of silence, if the messages are unread, the user should see a notification.
Delayed_Job
Has no solution, has related issues: https://github.com/collectiveidea/delayed_job/issues/72
Sidekiq
https://github.com/hummingbird-me/sidekiq-debounce
Doing yourself is not so bad.
class AdminJob
def self.debounce(job, args={})
handler = YAML.dump(job)
count = Delayed::Job.where(handler: handler).where('locked_at IS NULL').delete_all
Rails.logger.info("deleted: #{count} jobs")
Delayed::Job.enqueue(job, args)
end
end
Instead of writing:
Delayed::Job.enqueue(YourJobName.new(account_id), {run_at: 10.minutes.from_now})
You now write:
AdminJob.debounce(YourJobName.new(account_id), {run_at: 10.minutes.from_now})
Delayed job serializes your job params in YAML and then saves it to the database as handler. So if you call AdminJob.debounce(...) 10 times in a row, it will delete before each.
Make sure to give yourself time (5.minutes, etc) to give users to keep taking actions. If you run your job after 1 second, its likely they'll keep taking actions and trigger again.
Yes i'm answering my own question 3 years later...
What I would do is schedule a periodic task, lets say every 5 minutes that checks if there is someone who has to be notified. Yes it seems an expensive operation, but for your use case I don't see (for now) others solutions. So lets say you have 10 users that uses a chat. Every 5 minutes you could check if there are users that didn't see some messages, and if so you notify them only if they are inactive from N minutes.
To schedule such task you could use crono gem. Check this answer.
Crono lets you do thing like that:
Crono.perform(CheckUsersToBeNotifiedJob).every 5.minutes
If you are using Delayed Job as the implementation for Active Job then take a look at the 'activejob-trackable' gem.
https://github.com/ignatiusreza/activejob-trackable
It uses another table that tracks the jobs and can throttle and debounce.
Related
I'm creating a Rails application that let users book tickets for an event.
Once the user selects the ticket he wants to buy, he has 15 minutes to complete the checkout, otherwise the ticket will be released and can be booked by someone else.
How can I "block" the ticket for 15 minutes and make it available again after 15 minutes?
Have a status field in your ticket model(or whatever model you are using) and mark it as partially_booked when user selects it.
If you are using delayed jobs,
handle_asynchronously :your_method_here, :run_at => Proc.new { 15.minutes.from_now }
If you are using sidekiq,
YourWorker.perform_in(15.minutes, ticket_id)
where 'your_method_here' or 'YourWorker' will contain the logic to check if booking was completed or not, and mark it as unassigned if it was not completed.
ActiveJob helps doing jobs in the background. Using ActiveJob you can generate jobs for your delayed operations and enqueue them for later execution. You can add a boolean field to your model like reserved and make it true when the user first reserves it and run a background job so that if in 15 minutes later user won't complete the purchase, set it back to false:
TicketJob.set(wait_until: 15.minutes.from_now).perform_later(#ticket)
Active Job is a framework for declaring jobs and making them run on a variety of queuing backends. For setting up a backend for ActiveJob, take a look at DelayedJob. It is simple and good enough.
In my Rails 3.2 project, I am using SuckerPunch to run a expensive background task when a model is created/updated.
Users can do different types of interactions on this model. Most of the times these updates are pretty well spaced out, however for some other actions like re-ordering, bulk-updates etc, those POST requests can come in very frequently, and that's when it overwhelms the server.
My question is, what would be the most elegant/smart strategy to start the background job when first update happens, but wait for say 10 seconds to make sure no more updates are coming in to that Model (Table, not a row) and then execute the job. So effectively throttling without queuing.
My sucker_punch worker looks something like this:
class StaticMapWorker
include SuckerPunch::Job
workers 10
def perform(map,markers)
#perform some expensive job
end
end
It gets called from Marker and 'Map' model and sometimes from controllers (for update_all cases)like so:
after_save :generate_static_map_html
def generate_static_map_html
StaticMapWorker.new.async.perform(self.map, self.map.markers)
end
So, a pretty standard setup for running background job. How do I make the job wait or not schedule until there are no updates for x seconds on my Model (or Table)
If it helps, Map has_many Markers so triggering the job with logic that when any marker associations of a map update would be alright too.
What you are looking for is delayed jobs, implemented through ActiveJob's perform_later. According to the edge guides, that isn't implemented in sucker_punch.
ActiveJob::QueueAdapters comparison
Fret not, however, because you can implement it yourself pretty simply. When your job retrieves the job from the queue, first perform some math on the records modified_at timestamp, comparing it to 10 seconds ago. If the model has been modified, simply add the job to the queue and abort gracefully.
code!
As per the example 2/5 of the way down the page, explaining how to add a job within a worker Github sucker punch
class StaticMapWorker
include SuckerPunch::Job
workers 10
def perform(map,markers)
if Map.where(modified_at: 10.seconds.ago..Time.now).count > 0
StaticMapWorker.new.async.perform(map,markers)
else
#perform some expensive job
end
end
end
I have a rails application where I want to run a job in the background, but I need to run the job 2 hours from the original event.
The use case might be something like this:
User posts a product listing.
Background job is queued to syndicate listing to 3rd party api's, but even after original request, the response could take a while and the 3rd party's solution is to poll them every 2 hours to see if we can get a success acknowledgement.
So is there a way to queue a job, so that a worker daemon knows to ignore it or only listen to it at the scheduled time?
I don't want to use cron because it will load up a whole application stack and may be executed twice on overlapping long running jobs.
Can a priority queue be used for this? What solutions are there to implement this?
try delayed job - https://github.com/collectiveidea/delayed_job
something along these lines?
class ProductCheckSyndicateResponseJob < Struct.new(:product_id)
def perform
product = Product.find(product_id)
if product.still_needs_syndicate_response
# do it ...
# still no response, check again in two hours
Delayed::Job.enqueue(ProductCheckSyndicateResponseJob.new(product.id), :run_at => 2.hours.from_now)
else
# nothing to do ...
end
end
end
initialize job first time in controller or maybe before_create callback on model?
Delayed::Job.enqueue(ProductCheckSyndicateResponseJob.new(#product.id), :run_at => 2.hours.from_now)
Use the Rufus Scheduler gem. It runs as a background thread, so you don't have to load the entire application stack again. Add it to your Gemfile, and then your code is as simple as:
# in an initializer,
SCHEDULER = Rufus::Scheduler.start_new
# then wherever you want in your Rails app,
SCHEDULER.in('2h') do
# whatever code you want to run in 2 hours
end
The github page has tons of more examples.
In my Product#create method I have something like
ProductNotificationMailer.notify_product(n.email).deliver
Which fires off if the product gets saved. Now thing is before the above gets fired off, there are bunch of logics and calculations happening which delays the confirmation page load time. Is there a way to make sure the next page loads first and the mail delivery can happen later or in the background?
Thanks
Yes, you'll want to look into background workers. Sidekiq, DelayedJob or Resque are some popular ones.
Here's a great RailsCast demonstrating Sidekiq.
class NotificationWorker
include Sidekiq::Worker
def perform(n_id)
n = N.find(n_id)
ProductNotificationMailer.notify_product(n.email).deliver
end
end
I'm not sure what n was in your example, so I just went with it. Now where you do the work, you can replace it with:
NotificationWorker.perform_async(n.id)
The reason you don't pass full object n as an argument, is because the arguments will be serialized, and it's easier/faster to serialize just the integer id.
Once the jobs is stored, you have a second process running in the background that will do the work, freeing up your web process to immediately go back to rendering the response.
Delayed jobs will do this:
Here is the github page.
and here is a railscast on setting it up.
I want to give my users the option to send them a daily summary of their account statistics at a specific (user given) time ....
Lets say following model:
class DailySummery << ActiveRecord::Base
# attributes:
# send_at
# => 10:00 (hour)
# last_sent_at
# => Time of the last sent summary
end
Is there now a best practice how to send this account summaries via email to the specific time?
At the moment I have a infinite rake task running which checks permanently if emails are available for sending and i would like to put the dailysummary-generation and sending into this rake task.
I had a thought that I could solve this with following pseudo-code:
while true
User.all.each do |u|
u.generate_and_deliver_dailysummery if u.last_sent_at < Time.now - 24.hours
end
sleep 60
end
But I'm not sure if this has some hidden caveats...
Notice: I don't want to use queues like resq or redis or something like that!
EDIT: Added sleep (have it already in my script)
EDIT: It's a time critical service (notification of trade rates) so it should be as fast as possible. Thats the background why I don't want to use a queue or job based system. And I use Monit to manage this rake task, which works really fine.
There's only really two main ways you can do delayed execution. You run the script when an user on your site hits a page, which is inefficient and not entirely accurate. Or use some sort of background process, whether it's a cron job or resque/delayed job/etc.
While your method of having an rake process run forever will work fine, it's inefficient because you're iterating over users 24/7 as soon as it finishes, something like:
while true
User.where("last_sent_at <= ? OR last_sent_at = ?", 24.hours.ago, nil).each do |u|
u.generate_and_deliver_dailysummery
end
sleep 3600
end
Which would run once an hour and only pull users that needed an email sent is a bit more efficient. The best practice would be to use a cronjob though that runs your rake task though.
Running a task periodically is what cron is for. The whenever gem (https://github.com/javan/whenever) makes it simple to configure cron definitions for your app.
As your app scales, you may find that the rake task takes too long to run and that the queue is useful on top of cron scheduling. You can use cron to control when deliveries are scheduled but have them actually executed by a worker pool.
I see two possibilities to do a task at a specific time.
Background process / Worker / ...
It's what you already have done. I refactored your example, because there was two bad things.
Check conditions directly from your database, it's more efficient than loading potential useless data
Load users by batch. Imagine your database contains millions of users... I'm pretty sure you would be happy, but not Rails... not at all. :)
Beside your code I see another problem. How are you going to manage this background job on your production server? If you don't want to use Resque or something else, you should consider manage it another way. There is Monit and God which are both a process monitor.
while true
# Check the condition from your database
users = User.where(['last_sent_at < ? OR created_at IS NULL', 24.hours.ago])
# Load by batch of 1000
users.find_each(:batch_size => 1000) do |u|
u.generate_and_deliver_dailysummery
end
sleep 60
end
Cron jobs / Scheduled task / ...
The second possibility is to schedule your task recursively, for instance each hour or half-hour. Correct me if I'm wrong, but do your users really need to schedule the delivery at 10:39am? I think that let them choose the hour is enough.
Applying this, I think a job fired each hour is better than an infinite task querying your database every single minute. Moreover it's really easy to do, because you don't need to set up anything.
There is a good gem to manage cron task with the ruby syntax. More infos here : Whenever
You can do that, you'll need to also check for the time you want to send at. So starting with your pseudo code and adding to it:
while true
User.all.each do |u|
if u.last_sent_at < Time.now - 24.hours && Time.now.hour >= u.send_at
u.generate_and_deliver_dailysummery
# the next 2 lines are only needed if "generate_and_deliver_dailysummery" doesn't sent last_sent_at already
u.last_sent_at = Time.now
u.save
end
end
sleep 900
end
I've also added the sleep so you don't needlessly hammer your database. You might also want to look into limiting that loop to just the set of users you need to send to. A query similar what Zachary suggests would be much more efficient than what you have.
If you don't want to use a queue - consider delayed job (sort of a poor mans queue) - it does run as a rake task similar to what you are doing
https://github.com/collectiveidea/delayed_job
http://railscasts.com/episodes/171-delayed-job
it stores all tasks in a jobs table, usually when you add a task it queues it to run as soon as possible, however you can override this to delay it until a specific time
you could convert your DailySummary class to DailySummaryJob and once complete it could re-queue a new instance of itself for the next days run
How did you update the last_sent_at attribute?
if you use
last_sent_at += 24.hours
and initialized with last_sent_at = Time.now.at_beginning_of_day + send_at
it will be all ok .
don't use last_sent_at = Time.now . it is because there may be some delay when the job is actually done , this will make the last_sent_at attribute more and more "delayed".