Delayed job exclude queue - ruby-on-rails

I have a Delayed::Job queue which contains particularly slow-running tasks that I want crunched by its own set of dedicated workers, so there is less risk of it bottlenecking the rest of the worker pipeline.
RAILS_ENV=production script/delayed_job --queue=super_slow_stuff start
However, I then also want a general worker pool for all other queues, hopefully without having to specify them separately (as their names etc. are often changed or added to). Something akin to:
RAILS_ENV=production script/delayed_job --except-queue=super_slow_stuff start
I could use the wildcard * character, but I imagine this would cause the second worker to pick up the super slow jobs too?
Any suggestions on this?

You can define a global constant for your app listing all queues:
QUEUES = {
  mailers: 'mailers',
  # etc.
}
Then use this constant in your delay method calls:
object.delay(queue: QUEUES[:mailers]).do_something
and try to build the delayed_job arguments dynamically:
system("RAILS_ENV=production script/delayed_job --pool=super_slow_stuff --pool:#{(QUEUES.values-[super_slow_stuff]).join(',')}:number_of_workers start")

Unfortunately, this functionality is not implemented in Delayed::Job.
See:
https://github.com/collectiveidea/delayed_job/pull/466
https://github.com/collectiveidea/delayed_job/pull/901
You may fork the delayed_job repository and apply the simple patches from https://github.com/collectiveidea/delayed_job/pull/466.
Then use your GitHub repo, but please vote on https://github.com/collectiveidea/delayed_job/pull/466 so it finally gets merged into upstream.
Update:
I wrote an option to exclude queues for myself. It is in the exclude_queues branch: https://github.com/one-more-alex/delayed_job/tree/exclude_queues
https://github.com/one-more-alex/delayed_job_active_record/tree/exclude_queues
The option descriptions are included in Readme.md; here are the parts about exclusion:
# Option --exclude-specified-queues inverts queue processing by skipping the queues given via --queue/--queues.
# If both --pool=* and --exclude-specified-queues are given, no exclusions will be applied on "*".
If EXCLUDE_SPECIFIED_QUEUES is set to YES, then the queues defined by QUEUE/QUEUES will be skipped instead. See the --exclude-specified-queues option description for the special case of queue "*".
To answer the question strictly, the general worker would be started like this:
RAILS_ENV=production script/delayed_job --queue=super_slow_stuff --exclude-specified-queues start
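For completeness, the dedicated slow worker is then started exactly as in the question:
RAILS_ENV=production script/delayed_job --queue=super_slow_stuff start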
Warning
Please note that I am not going to support this Delayed::Job fork; the code is provided "as is" in the hope it will be useful.
I made a corresponding pull request: https://github.com/collectiveidea/delayed_job/pull/1019
There is also one for the Active Record backend: https://github.com/collectiveidea/delayed_job_active_record/pull/151
Only the Active Record backend is supported.

Related

Sidekiq ignoring concurrency when using multiple queues

I have two queues: development and ar_updater. I have a bunch of Sidekiq workers that are continuously updating an ActiveRecord object, but sometimes they query the record at the same time rather than only after another one has updated it. Therefore, I was thinking about just creating a queue and limiting its concurrency to 1 so that the correct data can be appended.
Here's my config/sidekiq.yml file:
development:
  :concurrency: 10
ar_updater:
  :concurrency: 1
:queues:
  - development
  - ar_updater
and then I just have a simple worker that looks like this:
class ArUpdateWorker
  include Sidekiq::Worker
  sidekiq_options queue: "ar_updater"

  def perform(options)
    options = options.transform_keys(&:to_sym)
    schedule_id = options[:schedule_id]
    progress = options[:progress]

    schedule = Schedule.find(schedule_id)
    old_progress = schedule.current_progress
    schedule.update(current_progress: old_progress + progress)
  end
end
Is there a way to make sure that these workers only run one at a time in this particular queue? When calling ArUpdateWorker multiple times, it seems like Sidekiq ignores the concurrency and just runs the worker as many times as I want, but if I switch the worker's queue to development, then it adheres to the concurrency of 10. A little confusing.
It seems the only way to work around this is to run Sidekiq with the concurrency settings set on the command-line interface.
You cannot actually do that. You have a queue called development, which can be very confusing because you also have an environment called development.
The configuration you have sets the concurrency of the environment called "development" to 10, not of the queue. A Sidekiq process has a single concurrency setting, chosen by environment; you can't give one process different concurrency levels for different queues.
This is a similar question:
How can I specify different concurrency queues for sidekiq configuration?
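A common workaround (a sketch, not part of the answer above; the commands assume the standard Sidekiq CLI flags -q and -c) is to run a dedicated Sidekiq process for the serial queue:
# main process serves the development queue with 10 threads
bundle exec sidekiq -q development -c 10
# dedicated process drains ar_updater one job at a time
bundle exec sidekiq -q ar_updater -c 1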

ActiveRecord::Base.connection.query_cache_enabled in sidekiq

I have a piece of code that performs the same queries over and over, and it's doing that in a background worker within a thread.
I checked out the ActiveRecord query cache middleware, but apparently it needs to be enabled before use. However, I'm not sure if it's a safe thing to do and if it will affect other running threads.
you can see the tests here: https://github.com/rails/rails/blob/3e36db4406beea32772b1db1e9a16cc1e8aea14c/activerecord/test/cases/query_cache_test.rb#L19
My question is: can I borrow and/or use the middleware directly to enable the query cache for the duration of a block, safely, in a thread?
When I tried ActiveRecord::Base.cache do ... end, my CI started failing left and right...
EDIT: In Rails 5 and later, the ActiveRecord query cache is automatically enabled even for background jobs like Sidekiq (see https://github.com/mperham/sidekiq/wiki/Problems-and-Troubleshooting#activerecord-query-cache for information on how to disable it).
Rails 4.x and earlier:
The difficulty with applying ActiveRecord::QueryCache to your Sidekiq workers is that, aside from the implementation details of it being a middleware, it's meant to be built during the request and destroyed at the end of it. Since background jobs don't have a request, you need to be careful about when you clear the cache. A reasonable approach would be to cache only for the duration of the perform method.
So, to implement that, you'll probably need to write your own piece of Sidekiq middleware, based on ActiveRecord::QueryCache but following Sidekiq's middleware guide. E.g.,
class SidekiqQueryCacheMiddleware
  def call(worker, job, queue)
    connection = ActiveRecord::Base.connection
    enabled = connection.query_cache_enabled
    connection_id = ActiveRecord::Base.connection_id
    connection.enable_query_cache!
    yield
  ensure
    ActiveRecord::Base.connection_id = connection_id
    ActiveRecord::Base.connection.clear_query_cache
    ActiveRecord::Base.connection.disable_query_cache! unless enabled
  end
end
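To wire it up, you would register the middleware in Sidekiq's server configuration, for example in an initializer (a sketch following Sidekiq's middleware guide; the file location is an assumption):
# config/initializers/sidekiq.rb -- hypothetical location
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add SidekiqQueryCacheMiddleware
  end
end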

celery, ignore task which is past due?

When the Django server is running and Celery is not running,
periodic tasks are generated.
When I resume Celery, I see the past tasks being run.
Can I mark specific tasks not to be run if they are past due?
Looks like the expires option of Task.apply_async will work for you.
For example, your task may look like this:
@periodic_task(run_every=timedelta(seconds=15), expires=15)
def update_something():
    # do something
    pass
It's a simple solution.
A more customisable solution might be as follows: save the result of your task to a cache (e.g. Redis), and while the result is in the cache, new tasks just return the cached value. This solution is very flexible, because you can store whatever information you need in the cache to decide what to do next (e.g. return the value from the cache, or rerun the task).

Permanent daemon for querying a web resource

I have a Rails 3 application and have looked around the internet for daemons, but didn't find the right one for me.
I want a daemon which permanently fetches data (exchange rates) from a web resource and saves it to the database,
like:
while true
  Model.update_attribute(:course, Net::HTTP.get(URI("asdasd")))
end
I've only seen cron-like jobs, but they only run at specific intervals... I want it to run permanently, with the timing depending only on how long each query takes...
Do you understand what I mean?
The gem light-daemon I wrote should work very well in your case.
http://rubygems.org/gems/light-daemon
You can write your code in a class which has a perform method, use a queue system like Resque, and at application startup enqueue the job with Resque.enqueue(Updater).
Obviously the job won't end until the application is stopped. Personally I don't like that, but if this is the requirement...
For this reason, if you need to execute other tasks, you should configure more than one worker process and optionally more than one queue.
If you can change your requirements and find a trigger for the update mechanism, the same approach still works; you only have to remove the while true loop.
Sample class:
class Updater
  @queue = :endless_queue

  def self.perform
    while true
      Model.update_attribute(:course, Net::HTTP.get(URI("asdasd")))
    end
  end
end
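To kick the job off when the application boots, you could enqueue it once from an initializer, as mentioned above (a sketch; the file name is an assumption):
# config/initializers/enqueue_updater.rb -- hypothetical file name
Resque.enqueue(Updater)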
Finally I found a cool solution for my problem:
I use the god gem -> http://god.rubyforge.org/
with a bash script (link) for starting/stopping a simple rake task (with an infinite loop in it).
Now it works fine, and I even have some monitoring with god running that ensures the rake task keeps running.
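For illustration, a minimal god watch for such a rake task might look roughly like this (a sketch only; the watch name and rake task are hypothetical):
# updater.god -- hypothetical config file
God.watch do |w|
  w.name  = "course_updater"
  w.start = "bundle exec rake courses:update_loop"  # the rake task with the infinite loop
  w.keepalive                                       # restart the task if it dies
end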

Ruby on Rails: How to run things in the background?

When a new resource is created and it needs to do some lengthy processing before the resource is ready, how do I send that processing away into the background, where it won't hold up the current request or other traffic to my web app?
In my model:
class User < ActiveRecord::Base
  after_save :background_check

  protected

  def background_check
    # check through a list of 10000000000001 mil different
    # databases that takes approx one hour :)
    if check_for_record_in_www(self.username)
      # code that is run after the 1 hour process is finished.
      self.update_attribute(:has_record, true)
    end
  end
end
You should definitely check out the following Railscasts:
http://railscasts.com/episodes/127-rake-in-background
http://railscasts.com/episodes/128-starling-and-workling
http://railscasts.com/episodes/129-custom-daemon
http://railscasts.com/episodes/366-sidekiq
They explain how to run background processes in Rails in every possible way (with or without a queue ...)
I've just been experimenting with the 'delayed_job' gem because it works with the Heroku hosting platform and it was ridiculously easy to set up!
Add the gem to your Gemfile, then run bundle install, rails g delayed_job, and rake db:migrate.
Then start a queue handler with:
RAILS_ENV=production script/delayed_job start
Where you have a method call which is your lengthy process, i.e.
company.send_mail_to_all_users
you change it to:
company.delay.send_mail_to_all_users
Check the full docs on github: https://github.com/collectiveidea/delayed_job
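Spelled out, those setup steps are roughly as follows (a sketch; the gem and generator shown assume the Active Record backend, which may differ from your setup):
# Gemfile
gem 'delayed_job_active_record'

bundle install
rails g delayed_job:active_record
rake db:migrate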
Start a separate process, which is probably most easily done with system, prepending a 'nohup' and appending an '&' to the end of the command you pass it. (Make sure the command is just one string argument, not a list of arguments.)
There are several reasons you want to do it this way, rather than, say, trying to use threads:
Ruby's threads can be a bit tricky when it comes to doing I/O; you have to take care that some things you do don't cause the entire process to block.
If you run a program with a different name, it's easily identifiable in 'ps', so you don't accidentally think it's a FastCGI back-end gone wild or something, and kill it.
Really, the process you start should be "daemonized"; see the Daemonize class for help.
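A minimal sketch of the system/nohup approach described above (the script name and log path are hypothetical):
# a single string argument means a shell runs it, so nohup and the trailing & work as intended
system("nohup ruby script/check_for_record_in_www.rb #{user.username} >> log/background_check.log 2>&1 &")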
You ideally want to use an existing background job server rather than writing your own. These will typically let you submit a job and get a unique key back; you can then use the key to periodically query the job server for the status of your job without blocking your web app. Here is a nice roundup of the various options out there.
I like to use BackgrounDRb; it's nice because it allows you to communicate with it during long processes, so you can have status updates in your Rails app.
I think spawn is a great way to fork your process, do some processing in the background, and show the user just a confirmation that this processing was started.
What about:
def background_check
  exec("script/runner check_for_record_in_www.rb #{self.username}") if fork == nil
end
The program "check_for_record_in_www.rb" will then run in another process and will have access to ActiveRecord, being able to access the database.
