I have a piece of code that performs the same queries over and over, and it's doing that in a background worker within a thread.
I checked out the ActiveRecord query cache middleware, but apparently it needs to be enabled before use. However, I'm not sure if that is safe to do or whether it will affect other running threads.
You can see the tests here: https://github.com/rails/rails/blob/3e36db4406beea32772b1db1e9a16cc1e8aea14c/activerecord/test/cases/query_cache_test.rb#L19
My question is: can I borrow and/or use the middleware directly to safely enable the query cache for the duration of a block in a thread?
When I tried ActiveRecord::Base.cache do, my CI started failing left and right...
EDIT: Rails 5 and later: the ActiveRecord query cache is automatically enabled even for background jobs like Sidekiq (see: https://github.com/mperham/sidekiq/wiki/Problems-and-Troubleshooting#activerecord-query-cache for information on how to disable it).
Rails 4.x and earlier:
The difficulty with applying ActiveRecord::QueryCache to your Sidekiq workers is that, aside from the implementation details of it being a middleware, it's meant to be built during the request and destroyed at the end of it. Since background jobs don't have a request, you need to be careful about when you clear the cache. A reasonable approach would be to cache only during the perform method, though.
So, to implement that, you'll probably need to write your own piece of Sidekiq middleware, based on ActiveRecord::QueryCache but following Sidekiq's middleware guide. E.g.,
    class SidekiqQueryCacheMiddleware
      def call(worker, job, queue)
        connection    = ActiveRecord::Base.connection
        enabled       = connection.query_cache_enabled
        connection_id = ActiveRecord::Base.connection_id
        connection.enable_query_cache!

        yield
      ensure
        ActiveRecord::Base.connection_id = connection_id
        ActiveRecord::Base.connection.clear_query_cache
        ActiveRecord::Base.connection.disable_query_cache! unless enabled
      end
    end
Related
I am developing a Rails app for network automation. Part of the app consists of the logic to run operations; the other part is the operations themselves. An operation is simply a Ruby class that performs several commands on a network device (router, switch, etc.).
Right now, the operations are simply part of the Rails app repo. But in order to make the development process more agile, I would like to decouple the app and the operations. I would have 2 repos: one for the app and one for the operations. The app deploy would follow the standard procedure, but the operations would sync every time something is pushed to master. And, more importantly, I don't want to restart the app after an update to the operations repo.
So my question is:
How can I exclude several classes (or namespaces) from being cached in a production Rails app? I mean that every time I call such a class, its file would be reread from disk. What are the potential dangers of doing so?
Some code example:
    # Example operation - I would like to add or modify such classes without restarting the app
    class FooOperation < BaseOperation
      def perform(host)
        conn = new_connection(host) # method from BaseOperation
        result = conn.execute("foo")
        if result =~ /Error/
          # retry, it's a known bug in device foo
          conn.execute("foo")
        else
          conn.exit
          return success # method from BaseOperation
        end
      end
    end

    # somewhere in admin panel I would do so:
    o = Operation.create(name: "Foo", class_name: "FooOperation")
    o.id # => 123 # for next example

    # Ruby worker which actually runs an operation
    class OperationWorker
      def perform(operation_id, host)
        operation = Operation.find(operation_id)
        # here, every time I load this I want ruby to search for the implementation on the filesystem, never cache
        klass = operation.class_name.constantize
        klass.new.perform(host)
      end
    end
i think you have quite a misunderstanding about how ruby code loading and interpretation works!
the fact that rails reloads classes at development time is kind of a "hack" to let you iterate on the code while the server has already loaded, parsed and executed parts of your application.
in order to do so, it has to implement quite some magic to unload your code and reload parts of it on change.
so if you want to have up-to-date code when executing an "operation", you are probably best off spawning a new process. this guarantees that your new code is read and parsed properly, starting from a blank state.
another thing you can do is use load instead of require, because load actually re-reads the source on every call, while require skips a file it has already loaded. keep in mind that subsequent calls to load just add to the code that already exists in the ruby VM: nothing that was defined earlier gets removed. so you need to make sure that every change is compatible with the already loaded code.
this could be circumvented by some clever instance_eval tricks, but i'm not sure that is what you want...
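to make the load behaviour concrete, here is a minimal sketch using only the standard library. the file name, the FooOperation stand-in, the legacy_step method and the reload_demo helper are all made up for the illustration; it is not how the asker's app is laid out.

```ruby
require "tmpdir"

# Hypothetical demo: write an "operation" file, load it, rewrite it, load it again.
def reload_demo
  results = {}

  Dir.mktmpdir do |dir|
    path = File.join(dir, "foo_operation.rb")

    # Version 1 of the file defines two methods.
    File.write(path, <<~RUBY)
      class FooOperation
        def perform
          "v1"
        end

        def legacy_step
          "still here"
        end
      end
    RUBY
    load path
    results[:first] = FooOperation.new.perform

    # Version 2 changes perform and no longer mentions legacy_step.
    File.write(path, <<~RUBY)
      class FooOperation
        def perform
          "v2"
        end
      end
    RUBY
    load path # re-reads the file; require would skip it the second time
    results[:second] = FooOperation.new.perform

    # load only reopened the class, so the method removed from the file survives.
    results[:stale_method] = FooOperation.new.respond_to?(:legacy_step)
  end

  results
end

p reload_demo
```

the second load picks up the new perform, but legacy_step is still callable even though the file no longer defines it. that is the "adds to the already existing code" caveat in practice.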
I have a rails application with a dynamically configured time zone. It is stored in a database table containing other options, and the rails application itself is configured to UTC (default).
I've made the application itself aware of the timezone with a simple around filter using Time.use_zone(..., &block).
I would like to do something similar for my Sidekiq workers. Some of them process data that has timezone relevance, so they need it. I don't see any filtering options available in Sidekiq itself: no callbacks or before/after type things I can hook into. My current solution is to prepend a module, like so:
    module TimeZoneAwareWorker
      def perform(*args)
        Time.use_zone(Options.time_zone) do
          super
        end
      end
    end
and mixed in:
    class MyWorker
      include Sidekiq::Worker
      prepend TimeZoneAwareWorker
      ...
    end
This works fine for simple workers, but breaks down when the prepend sits in the same class as the include Sidekiq::Worker and that worker is then subclassed. A subclass that defines its own perform comes before the prepended module in the ancestor chain, so the hierarchy doesn't work out for the prepended perform to wrap the implementation.
Is there a better way? Ultimately it seems what I really want is a foolproof method of wrapping a single method with another method, and yielding the wrapped implementation.
I know my other option is monkeypatching before/after/around type callbacks into Sidekiq's implementation, but I'd like to only go there if forced.
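The breakdown described above can be reproduced without Sidekiq. The sketch below uses made-up stand-ins (Wrapper, BaseWorker, SubWorker) for the module and workers in the question, and only shows where the prepended module ends up in each ancestor chain.

```ruby
# Hypothetical stand-ins for the module and workers in the question.
module Wrapper
  def perform(*args)
    "wrapped(#{super})"
  end
end

class BaseWorker
  prepend Wrapper

  def perform(*args)
    "base"
  end
end

class SubWorker < BaseWorker
  def perform(*args)
    "sub"
  end
end

# Wrapper sits in front of BaseWorker, but behind SubWorker.
puts BaseWorker.ancestors.first(2).inspect
puts SubWorker.ancestors.first(3).inspect

puts BaseWorker.new.perform # the wrapper runs around the base implementation
puts SubWorker.new.perform  # the subclass's perform is found first; no wrapping
```

Because method lookup walks the ancestor list in order, SubWorker#perform wins before Wrapper is ever reached. The wrapper only runs for the subclass if SubWorker#perform itself calls super.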
Sidekiq has its own middleware solution:
Sidekiq has a similar notion of middleware to Rack: these are small
bits of code that can implement functionality. Sidekiq breaks
middleware into client-side and server-side.
Client-side middleware runs before the pushing of the job to Redis and allows you to modify/stop the job before it gets pushed. Client
middleware may receive the class argument as a Class object or a
String containing the name of the class.
Server-side middleware runs 'around' job processing. Sidekiq's retry feature is implemented as a simple middleware.
You can easily create your own middleware agent to add the timezone awareness code.
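A minimal sketch of what such a server-side middleware could look like. The class name TimeZoneMiddleware is made up; Options.time_zone is the lookup from the question, and Time.use_zone comes from ActiveSupport, so this assumes it runs inside the Rails/Sidekiq process. The defined?(Sidekiq) guard is only there so the snippet can be loaded where Sidekiq is absent.

```ruby
# Server-side middleware: Sidekiq invokes #call around every job, and the
# job itself runs when the middleware yields.
class TimeZoneMiddleware
  def call(worker, job, queue)
    # Options.time_zone is the setting lookup from the question (an assumption
    # about the app); Time.use_zone restores the previous zone afterwards.
    Time.use_zone(Options.time_zone) { yield }
  end
end

# Registration, e.g. in config/initializers/sidekiq.rb.
if defined?(Sidekiq)
  Sidekiq.configure_server do |config|
    config.server_middleware do |chain|
      chain.add TimeZoneMiddleware
    end
  end
end
```

Since the middleware wraps every job regardless of class hierarchy, subclassed workers are covered too, and the workers themselves no longer need the prepend.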
I have a Rails 3 application and looked around the internet for daemons, but didn't find the right one for me.
I want a daemon which permanently fetches data (exchange rates) from a web resource and saves it to the database.
like:
    while true
      Model.update_attribute(:course, http::get.new("asdasd").response)
    end
I've only seen cron like jobs, but they only run after a specific time... I want it permanently, depending on how long it takes to end the query...
Do you understand what i mean?
The gem light-daemon I wrote should work very well in your case.
http://rubygems.org/gems/light-daemon
You can write your code in a class which has a perform method, use a queue system like this and at application startup enqueue the job with Resque.enqueue(Updater).
Obviously the job won't end until the application is stopped. Personally I don't like that, but if this is the requirement, it will do.
For this reason, if you need to execute other tasks, you should configure more than one worker process and optionally more than one queue.
If you can change your requirements and find a trigger for the update mechanism, the same approach still works; you only have to remove the while true loop.
Sample class needed:
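    class Updater
      @queue = :endless_queue # Resque reads the queue name from this class-level ivar

      def self.perform
        while true
          # pseudocode from the question: fetch the current value and store it
          Model.update_attribute(:course, http::get.new("asdasd").response)
        end
      end
    end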
Finally I found a cool solution for my problem:
I use the god gem -> http://god.rubyforge.org/
with a bash script (link) for starting / stopping a simple rake task (with an infinite loop in it).
Now it works fine, and I even have some monitoring: god is running and ensures that the rake task keeps running OK.
In my rails application, I have a background process runner, model name Worker, that checks for new tasks to run every 10 seconds. This check generates two SQL queries each time - one to look for new jobs, one to delete old completed ones.
The problem with this - the main log file gets spammed for each of those queries.
Can I direct the SQL queries spawned by the Worker model into a separate log file, or at least silence them? Overwriting Worker.logger does not work - it redirects only the messages that explicitly call logger.debug("something").
The simplest and most idiomatic solution:
    logger.silence do
      do_something
    end
See Logger#silence
Queries are logged at the adapter level, as I demonstrated here: How do I get the last SQL query performed by ActiveRecord in Ruby on Rails?
You can't change that behavior without tweaking the adapter with some really, really horrible hacks.
    class Worker < ActiveRecord::Base
      def run
        old_level, self.class.logger.level = self.class.logger.level, Logger::WARN
        run_outstanding_jobs
        remove_obsolete_jobs
      ensure
        self.class.logger.level = old_level
      end
    end
This is a fairly familiar idiom. I've seen it many times, in different situations. Of course, if you didn't know that ActiveRecord::Base.logger can be changed like that, it would have been hard to guess.
One caveat of this solution: this changes the logger level for all of ActiveRecord, ActionController, ActionView, ActionMailer and ActiveResource. This is because there is a single Logger instance shared by all modules.
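The same save, raise, restore idiom can be tried outside Rails with the standard library Logger. This is only a sketch: the helper names with_log_level and log_level_demo are made up, and a StringIO stands in for the log file.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: raise the level for the duration of the block and
# always restore it, even if the block raises.
def with_log_level(logger, level)
  old_level = logger.level
  logger.level = level
  yield
ensure
  logger.level = old_level
end

# Returns everything that ended up in the (in-memory) log.
def log_level_demo
  io = StringIO.new
  logger = Logger.new(io)
  logger.level = Logger::DEBUG

  logger.debug("before")
  with_log_level(logger, Logger::WARN) do
    logger.debug("SELECT * FROM workers") # below WARN, so it is dropped
    logger.warn("still visible")
  end
  logger.debug("after")

  io.string
end

puts log_level_demo
```

Note that the level is a property of the logger object, so while the block runs, other threads writing through the same logger are quieted as well.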
When a new resource is created and it needs to do some lengthy processing before the resource is ready, how do I send that processing away into the background where it won't hold up the current request or other traffic to my web-app?
in my model:
    class User < ActiveRecord::Base
      after_save :background_check

      protected

      def background_check
        # check through a list of 10000000000001 mil different
        # databases that takes approx one hour :)
        if check_for_record_in_www(username)
          # code that is run after the 1 hour process is finished.
          update_attribute(:has_record, true)
        end
      end
    end
You should definitely check out the following Railscasts:
http://railscasts.com/episodes/127-rake-in-background
http://railscasts.com/episodes/128-starling-and-workling
http://railscasts.com/episodes/129-custom-daemon
http://railscasts.com/episodes/366-sidekiq
They explain how to run background processes in Rails in every possible way (with or without a queue ...)
I've just been experimenting with the 'delayed_job' gem because it works with the Heroku hosting platform and it was ridiculously easy to setup!!
Add gem to Gemfile, bundle install, rails g delayed_job, rake db:migrate
Then start a queue handler with;
RAILS_ENV=production script/delayed_job start
Where you have a method call which is your lengthy process i.e
company.send_mail_to_all_users
you change it to;
company.delay.send_mail_to_all_users
Check the full docs on github: https://github.com/collectiveidea/delayed_job
Start a separate process, which is probably most easily done with system, prepending a 'nohup' and appending an '&' to the end of the command you pass it. (Make sure the command is just one string argument, not a list of arguments.)
There are several reasons you want to do it this way, rather than, say, trying to use threads:
Ruby's threads can be a bit tricky when it comes to doing I/O; you have to take care that some things you do don't cause the entire process to block.
If you run a program with a different name, it's easily identifiable in 'ps', so you don't accidentally think it's a FastCGI back-end gone wild or something, and kill it.
Really, the process you start should be "daemonized"; see the Daemonize class for help.
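As a rough sketch of the "start a separate process" idea, the snippet below uses Process.spawn and Process.detach from Ruby's core instead of system with nohup and &. That is a swapped-in technique, not what the answer literally describes, and it is not full daemonization; the helper name run_in_background is made up, and the pgroup option assumes a Unix-like system.

```ruby
require "rbconfig"

# Hypothetical helper: start a command in its own process group and return
# immediately with the child's pid.
def run_in_background(*command)
  pid = Process.spawn(
    *command,
    in: File::NULL, out: File::NULL, err: File::NULL, # detach from our stdio
    pgroup: true                                      # new process group (Unix)
  )
  Process.detach(pid) # reap the child later so it does not linger as a zombie
  pid
end

# Example: launch a short-lived Ruby child without waiting for it.
pid = run_in_background(RbConfig.ruby, "-e", "sleep 1")
puts "started background process #{pid}"
```

Passing the command as a list avoids the shell entirely, which is why no '&' or quoting is needed here; the single-string form is only required when the shell has to interpret the command line.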
you ideally want to use an existing background job server, rather than writing your own. these will typically let you submit a job and give it a unique key; you can then use the key to periodically query the jobserver for the status of your job without blocking your webapp. here is a nice roundup of the various options out there.
I like to use backgroundrb. It's nice because it allows you to communicate with it during long processes, so you can have status updates in your Rails app.
I think spawn is a great way to fork your process, do some processing in background, and show user just some confirmation that this processing was started.
What about:
    def background_check
      exec("script/runner check_for_record_in_www.rb #{self.username}") if fork.nil?
    end
The program "check_for_record_in_www.rb" will then run in another process and will have access to ActiveRecord, being able to access the database.