Record progress for long running ActiveJob - ruby-on-rails

Based on the question "How to reference active delayed_job within the actual job", I'm using Delayed::Job with an additional progress text column to record the progress of a long-running task.
I'm now trying to update my code to use ActiveJob, so I've replaced def before with before_perform, but the job object passed to before_perform is not the same as the one passed to before. And quite rightly, because the queue adapter is configurable and may not always be :delayed_job.
So, given that the queue adapter is configurable, is there a correct way to access (read and write) the progress column in table delayed_jobs?
Thanks.
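One adapter-independent approach, not taken from the original thread, is to stop relying on the delayed_jobs row and keep progress in a table of your own, keyed by the job's ID (ActiveJob assigns each job a job_id). A minimal plain-Ruby sketch of that idea follows; ProgressStore and LongTask are hypothetical names, and an in-memory hash stands in for the table:

```ruby
require "securerandom"

# Hypothetical stand-in for a database table keyed by job id.
# In a Rails app this could be an ActiveRecord model instead.
class ProgressStore
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  def write(job_id, text)
    @lock.synchronize { @rows[job_id] = text }
  end

  def read(job_id)
    @lock.synchronize { @rows[job_id] }
  end
end

# Hypothetical job class; ActiveJob exposes a comparable job_id attribute.
class LongTask
  attr_reader :job_id

  def initialize(store)
    @store = store
    @job_id = SecureRandom.uuid
  end

  def perform(steps)
    steps.times do |i|
      # ... one unit of real work would go here ...
      @store.write(job_id, "#{i + 1}/#{steps}")
    end
  end
end

store = ProgressStore.new
task = LongTask.new(store)
task.perform(3)
puts store.read(task.job_id) # prints 3/3
```

Because the store is addressed only by job ID, the same code works whichever queue adapter actually runs the job.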

Related

Automatically delete all records after 30 days

I have a concept for a Rails app that I want to make. I want a model whose records a user can create, each with a boolean attribute. After 30 days (one month), a record should automatically delete itself unless its boolean attribute is true.
In Rails 5 you have access to Active Job (http://guides.rubyonrails.org/active_job_basics.html).
There are two simple ways.
After creating the record, you could set this job to be executed after 30 days. This job checks if the record matches the specifications.
The other alternative is to create a job that runs every day, queries the database for every record of this model that was created at least 30 days ago, and destroys those that match the deletion criteria. (If the flag is stored in the database, this should be as easy as: MyModel.where("created_at <= ?", 30.days.ago).where(destroyable: true).destroy_all)
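The daily sweep depends on a cutoff comparison ("created at or before 30 days ago") rather than an exact timestamp match, combined with the boolean flag. A minimal plain-Ruby sketch of that selection rule, where Entry, keep and expired are hypothetical names and an array stands in for the table:

```ruby
# Hypothetical record type; created_at and keep mirror the columns discussed above.
Entry = Struct.new(:id, :created_at, :keep)

SECONDS_PER_DAY = 24 * 60 * 60

# Returns the entries that are at least max_age_days old and not flagged to keep.
def expired(entries, now: Time.now, max_age_days: 30)
  cutoff = now - max_age_days * SECONDS_PER_DAY
  entries.select { |e| e.created_at <= cutoff && !e.keep }
end

now = Time.now
entries = [
  Entry.new(1, now - 31 * SECONDS_PER_DAY, false), # old, not kept: expired
  Entry.new(2, now - 31 * SECONDS_PER_DAY, true),  # old, but flagged to keep
  Entry.new(3, now - 5 * SECONDS_PER_DAY, false)   # too recent
]

expired_ids = expired(entries, now: now).map(&:id)
puts expired_ids.inspect # prints [1]
```

Using a cutoff also means a missed run is harmless: the next sweep still picks up everything that has aged past 30 days.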
There are a couple of options for achieving this:
1. Whenever, running a script or rake task
2. Clockwork, running a script or rake task
3. Background jobs, which in my opinion is the "Rails way".
For options 1 and 2 you need to check every day whether a record is 30 days old and delete it if its boolean is not true (which means either checking all the records or narrowing the daily query to only the records that have reached 30 days). For option 3, you can schedule a job when the record is created, to run after 30 days and do the check for that record independently. How you do this depends on how you process jobs: for example, with Sidekiq you can use scheduled jobs, and with Resque you can use resque-scheduler.
Performing the deletion is straightforward: create a class method (e.g. Record.prune) on the record class in question that performs the deletion based on a query, e.g. Record.where(retain: false).destroy_all, where retain is the boolean attribute you mention. I'd recommend then defining a rake task in lib/tasks that invokes this method, e.g.:
namespace :record do
  # Loads the Rails environment, then deletes the prunable records.
  task :prune => :environment do
    Record.prune
  end
end
The scheduling is more difficult; a crontab entry is sufficient to provide the correct timing, but ensuring that an appropriate environment is in place (e.g. one that has loaded rbenv/rvm and any required environment variables) is harder. Ensuring that your deployment process produces binstubs is probably helpful here. From there, bin/rake record:prune ought to be enough. It's hard to provide a more in-depth answer without more knowledge of all the environments in which you hope to accomplish this task.
I want to mention a non-Rails approach, which depends on your database. With MongoDB you can use its "Expire Data from Collections" feature. With MySQL you can use the MySQL event scheduler. You can find a good example in the question "What is the best way to delete old rows from MySQL on a rolling basis?"

How to create unique delayed jobs

I have a method like this one
def abc
  # some stuff here
end
handle_asynchronously :abc, queue: :xyz
I want to create a delayed job for this only if there isn't one already in the queue.
I really feel like this should have an easy solution
Thanks!
I know this post is old, but it hasn't been answered.
Delayed Job does not provide a way to identify jobs: https://github.com/collectiveidea/delayed_job/issues/192
My suggestion is that your job check, when it executes, whether it still has to run, for example by comparing against a database value. Inserting jobs into the table should be quick, and you might lose that speed if you start searching the queue for a particular job before every insert.
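The run-time check can be sketched in plain Ruby as follows. Document, RebuildJob and the dirty flag are hypothetical names chosen for illustration; in a real application the flag would be a database value, as suggested above:

```ruby
# Hypothetical stand-in for a database row: `dirty` marks that work is pending.
class Document
  attr_reader :rebuilds

  def initialize
    @dirty = false
    @rebuilds = 0
  end

  def mark_dirty
    @dirty = true
  end

  def dirty?
    @dirty
  end

  def rebuild!
    @rebuilds += 1
    @dirty = false
  end
end

# Hypothetical job: it checks the stored state before doing any work.
class RebuildJob
  def initialize(document)
    @document = document
  end

  def perform
    return :skipped unless @document.dirty?
    @document.rebuild!
    :done
  end
end

doc = Document.new
queue = []

# Three changes enqueue three jobs without any lookup in the queue.
3.times do
  doc.mark_dirty
  queue << RebuildJob.new(doc)
end

results = queue.map(&:perform)
puts results.inspect # prints [:done, :skipped, :skipped]
```

Enqueuing stays a plain insert, and the duplicates cost only a cheap check when they run.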
If you still want to look for duplicates when enqueuing, this might help you.
https://gist.github.com/landovsky/8c505ecab41eb38fa1c2cd23058a6ae3
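For comparison, the enqueue-time check could look roughly like the sketch below. This is not the code from the gist; UniqueQueue and its key scheme are hypothetical, and an in-memory set stands in for a lookup against the delayed_jobs table:

```ruby
require "set"

# Hypothetical in-memory queue that refuses a job whose key is already pending.
class UniqueQueue
  def initialize
    @jobs = []
    @pending_keys = Set.new
  end

  # Returns true if the job was added, false if an identical one is pending.
  def enqueue(key, &work)
    return false unless @pending_keys.add?(key)
    @jobs << [key, work]
    true
  end

  # Runs the oldest job and frees its key so it can be enqueued again.
  def work_off_one
    key, work = @jobs.shift
    return nil if key.nil?
    @pending_keys.delete(key)
    work.call
  end

  def size
    @jobs.size
  end
end

queue = UniqueQueue.new
first  = queue.enqueue("abc:42") { :ran }
second = queue.enqueue("abc:42") { :ran } # duplicate while the first is pending
queue.work_off_one
third  = queue.enqueue("abc:42") { :ran } # allowed again after the run
puts [first, second, third].inspect # prints [true, false, true]
```

With a real database-backed queue, a check followed by an insert is not atomic across processes, so two enqueuers can still race unless a unique constraint or lock backs the check.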

how to run only a single instance of a worker job using Sidekiq

I have an app that serves up many complex JSON pieces. Some of these could hit the database 1000 times. I pre-cache the results in our db such that when we change the underlying data we regenerate the JSON frag which then gets stored in the DB. I am using Sidekiq to manage this. One issue is that if I put this cache killing code into a Rails Model callback, I could potentially get multiple copies of the worker request at once. This is obviously not what I want. Is there a way in Sidekiq to say only add this to queue if it currently is not in the queue?
Here's an example of what I'm doing:
class ArcBackground
  include Sidekiq::Worker

  def perform(id)
    Menu.regenerateMenuJSON(id)
    Menu.regenerateMenuJSONFull(id)
  end
end

How to force Rails ActiveRecord to commit a transaction flush

Is it possible to force ActiveRecord to push/flush a transaction (or just a save/create)?
I have a clock worker that creates tasks in the background for several task workers. The problem is, the clock worker will sometimes create a task and push it to a task worker before the clock worker information has been fully flushed to the db which causes an ugly race condition.
Using after_commit isn't really viable due to the architecture of the product and how the tasks are generated.
So in short, I need to be able to have one worker create a task and flush that task to the db.
ActiveRecord uses #transaction to create a block that begins and either rolls back or commits a transaction. I believe that would help your issue. Essentially (presuming Task is an ActiveRecord class):
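new_task = nil # declared outside the block so it is still visible after it
Task.transaction do
  new_task = Task.create(...)
end
# The transaction has committed by the time this line runs.
BackgroundQueue.enqueue(new_task)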
You could also go directly to the #connection underneath with:
Task.connection.commit_db_transaction
That's a bit low-level, though, and you have to be pretty confident about the way the code is being used. #after_commit is the best answer, even if it takes a little rejiggering of the code to make it work. If it won't work for certain, then these two approaches should help.
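Why the enqueue has to come after the commit can be illustrated with a toy model in plain Ruby. TinyStore is a hypothetical in-memory stand-in for the database, not ActiveRecord; it only mimics the rule that uncommitted rows are invisible to other readers:

```ruby
# Hypothetical in-memory store: writes made inside a transaction are buffered
# and become visible to other readers only when the block finishes.
class TinyStore
  def initialize
    @committed = {}
    @pending = nil
  end

  def transaction
    @pending = {}
    result = yield
    @committed.merge!(@pending) # commit
    result
  ensure
    @pending = nil # after an exception nothing was merged, which acts as a rollback
  end

  def create(id, attrs)
    (@pending || @committed)[id] = attrs
    id
  end

  # What a separate worker process would see: committed rows only.
  def find_committed(id)
    @committed[id]
  end
end

store = TinyStore.new
queue = []

# Enqueue inside the transaction: a worker that runs right away sees no row.
seen_early = nil
store.transaction do
  id = store.create(1, { name: "early" })
  queue << id
  seen_early = store.find_committed(queue.shift)
end

# Enqueue after the block returns: the row is committed before the worker looks.
id = store.transaction { store.create(2, { name: "late" }) }
queue << id
seen_late = store.find_committed(queue.shift)

puts seen_early.inspect
puts seen_late.inspect
```

In this model the first lookup finds nothing and the second finds the row, which is the same ordering problem the clock worker runs into.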
execute uses async_exec under the hood which may or may not be what you want. You could try using the lower level methods execute_and_clear (or even exec_no_cache) instead.

How do I run delayed job inserts in the background without affecting page load - Rails

I have a RoR application for posting answers to questions. When a user answers a question, notification messages are sent to all the users who have watch-listed the question, to those who track the question, and to the owner of the question. I am using delayed jobs to create the notification messages. So, while creating an answer, many inserts into the delayed jobs table take place, which slows down the page load: it takes longer to redirect to the question show page after the answer is created.
Currently I am inserting into answers table using AJAX request. Is there any way to insert into delayed jobs table in background after the AJAX request completes?
As we have been trying to say in comments:
It sounds like you have something like:
User.all.each do |user|
  user.delay.some_long_operation
end
This ends up inserting a lot of rows into delayed_jobs. What we are suggesting is to refactor that code into the delayed job itself, roughly:
def delayed_operation
  User.all.each do |user|
    user.some_long_operation
  end
end
self.delay.delayed_operation
Obviously, you'll have to adapt that, and probably put the delayed_operation into a model library somewhere, maybe as a class method... but the point is to put the delay call outside the big query and loop.
I really advise doing this in a separate process. Why should the user have to wait for those meta-actions? Stick to delivering a result page and merely notifying your server that something has to be done.
Create a separate model PostponedAction to build a list of 'to-do' actions. If you post an answer, add one PostponedAction to this database, with a parameter of the answer id. Then give the results back to the user.
Use a separate process (a cron job) to read the PostponedAction items and handle them. Mark them as 'handled', or delete them, on successful handling. This way, the user is not bothered by slow server processes.
Besides the email jobs you currently have, introduce another type of job that handles the creation of these jobs.
def email_all
  User.all.each do |user|
    user.delay.email_one()
  end
end

def email_one
  # do the emailing
end

self.delay.email_all()
This way the user action only triggers one insert before they see the response. You can also track individual jobs.
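The fan-out pattern can be seen end to end in a small plain-Ruby simulation. QUEUE, DELIVERED, USERS and the worker loop are hypothetical stand-ins for the delayed_jobs table, the mailer, the users table and the job worker:

```ruby
# Hypothetical in-memory queue; each entry is a callable standing in for a job row.
QUEUE = []
DELIVERED = []

User = Struct.new(:name)
USERS = [User.new("ann"), User.new("bob"), User.new("cy")]

# Per-user job: the slow work itself.
def email_one(user)
  DELIVERED << user.name
end

# Fan-out job: runs in the worker and enqueues one job per user.
def email_all
  USERS.each { |user| QUEUE << -> { email_one(user) } }
end

# Request path: a single insert, then the response can be returned.
QUEUE << -> { email_all }
inserts_during_request = QUEUE.size

# Worker loop: drain the queue, including jobs added while draining.
QUEUE.shift.call until QUEUE.empty?

puts inserts_during_request # prints 1
puts DELIVERED.inspect      # prints ["ann", "bob", "cy"]
```

The per-user inserts still happen, but inside the worker rather than during the request.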
