I'm having big issues trying to get delayed_job working with Amazon S3 and Paperclip. There are a few posts around about how to do it, but for whatever reason it's simply not working for me. I've removed a couple of things compared to how others are doing it: originally I had a save(:validate => false) in regenerate_styles!, but that seemed to cause an infinite loop (due to the after_save callback), and it didn't seem to be necessary (since the URLs have been saved, just the images not uploaded). Here's the relevant code from my model file, submission.rb:
class Submission < ActiveRecord::Base
  has_attached_file :photo ...
  ...

  before_photo_post_process do |submission|
    if submission.photo_changed?
      false
    end
  end

  after_save do |submission|
    if submission.photo_changed?
      Delayed::Job.enqueue ImageJob.new(submission.id)
    end
  end

  def regenerate_styles!
    puts "Processing photo"
    self.photo.reprocess!
  end

  def photo_changed?
    self.photo_file_size_changed? ||
      self.photo_file_name_changed? ||
      self.photo_content_type_changed? ||
      self.photo_updated_at_changed?
  end
end
And my little ImageJob class that sits at the bottom of the submission.rb file:
class ImageJob < Struct.new(:submission_id)
  def perform
    Submission.find(self.submission_id).regenerate_styles!
  end
end
As far as I can tell, the job itself gets created correctly (as I'm able to pull it out of the database via a query).
The problem arises when:
$ rake jobs:work
WARNING: Nokogiri was built against LibXML version 2.7.8, but has dynamically loaded 2.7.3
[Worker(host:Jarrod-Robins-MacBook.local pid:21738)] New Relic Ruby Agent Monitoring DJ worker host:MacBook.local pid:21738
[Worker(host:MacBook.local pid:21738)] Starting job worker
Processing photo
[Worker(host:MacBook.local pid:21738)] ImageJob completed after 9.5223
[Worker(host:MacBook.local pid:21738)] 1 jobs processed at 0.1045 j/s, 0 failed ...
The rake task then gets stuck and never exits, and the images themselves don't appear to have been reprocessed.
Any ideas?
EDIT: just another point; the same thing happens on Heroku, not just locally.
Delayed Job captures a stack trace for every failed job. It's saved in the last_error column of the delayed_jobs table. Use a database GUI to see what's going on.
If you are using the Collective Idea fork with ActiveRecord as the backend, you can query the model as usual. For example, to fetch an array of all stack traces, do
Delayed::Job.where('failed_at IS NOT NULL').map(&:last_error)
By default, failed jobs are deleted after 25 failed attempts, so it may be that there are no failed jobs left in the table. Prevent deletion for debugging purposes by setting
Delayed::Worker.destroy_failed_jobs = false
in your config/initializers/delayed_job_config.rb
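For instance, to inspect just the most recent failure from a Rails console (a small sketch, again assuming the ActiveRecord backend):
job = Delayed::Job.where('failed_at IS NOT NULL').order('failed_at DESC').first
puts job.last_error if job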
I have a Ruby on Rails web application deployed on Heroku.
This web app fetches job feeds from given URLs as XML, then normalizes these XMLs and creates a single XML file. It worked pretty well for a while. However, as the number of URLs and job ads has increased, it no longer works at all: the process sometimes takes up to 45 seconds because there are over 35K job vacancies, and Heroku times out requests after 30 seconds, so I am getting an H12 timeout error. This error led me to read about worker dynos and background processing.
I figured out that I should apply the approach below:
Scalable-approach Heroku
Now I am using Redis and Sidekiq on my project. And I am able to create a background worker to do all the dirty work. But here is my question.
Instead of doing this call in the controller class:
def apply
send_data Aggregator.new(providers: providers).call,
type: 'text/xml; charset=UTF-8;',
disposition: 'attachment; filename=indeed_apply_yes.xml'
end
I am doing this perform_async call:
def apply
  ReportWorker.perform_async(Time.now)
  redirect_to health_path # and returns status 200 OK
end
I implemented this class: ReportWorker calls the Aggregator service. data_xml holds the data that I need to show somewhere, or have downloaded automatically, when it's ready.
class ReportWorker
  include Sidekiq::Worker
  sidekiq_options retry: false

  data_xml = nil

  def perform(start_date)
    url_one = 'https://www.examplea.com/abc/download-xml'
    url_two = 'https://www.exampleb.com/efg/download-xml'
    cursor = 'stop'
    providers = [url_one, url_two, cursor]
    puts "SIDEKIQ WORKER GENERATING THE XML-DATA AT #{start_date}"
    data_xml = Aggregator.new(providers: providers).call
    puts "SIDEKIQ WORKER GENERATED THE XML-DATA AT #{Time.now}"
  end
end
I know it's not recommended to make send_data/send_file accessible outside of controller classes. Well, any suggestions on how to do it?
Thanks in advance!!
Can you set up a database for your application? Then store a record about each completed job there. You can also save the entire file in the database, but I recommend cloud storage (like Amazon S3) instead.
After that you can show the current status of queued jobs on a page for the user, with a 'download' button once the job is done.
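A minimal sketch of that pattern (the Report model, its status and file_url columns, and the S3Uploader helper are assumptions for illustration, not part of the original code):

class ReportWorker
  include Sidekiq::Worker
  sidekiq_options retry: false

  def perform(report_id)
    report = Report.find(report_id)
    providers = ['https://www.examplea.com/abc/download-xml',
                 'https://www.exampleb.com/efg/download-xml',
                 'stop']
    data_xml = Aggregator.new(providers: providers).call
    # Upload the result somewhere durable and remember where it lives.
    file_url = S3Uploader.upload(data_xml) # hypothetical upload helper
    report.update!(status: 'done', file_url: file_url)
  end
end

def apply
  report = Report.create!(status: 'pending')
  ReportWorker.perform_async(report.id)
  # The status page shows a 'download' link once report.status is 'done'.
  redirect_to report_path(report)
end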
I have an API which uses a service, in which I have used Ruby threads to reduce the response time of the API. I have tried to share the context using the following example. It was working fine with Rails 4 and Ruby 2.2.1.
Now we have upgraded Rails to 5.2.3 and Ruby to 2.6.5, after which the service has stopped working. I can call the service from the console and it works fine, but with an API call the service becomes unresponsive once it reaches CurrencyConverter.new. Any idea what the issue could be?
class ParallelTest
  def initialize
    puts "Initialized"
  end

  def perform
    # Our sample set of currencies
    currencies = ['ARS','AUD','CAD','CNY','DEM','EUR','GBP','HKD','ILS','INR','USD','XAG','XAU']
    # Create an array to keep track of threads
    threads = []
    currencies.each do |currency|
      # Keep track of the child threads as you spawn them
      threads << Thread.new do
        puts currency
        CurrencyConverter.new(currency).print
      end
    end
    # Join on the child threads to allow them to finish
    threads.each do |thread|
      thread.join
    end
    { success: true }
  end
end
class CurrencyConverter
  def initialize(params)
    @curr = params
  end

  def print
    puts @curr
  end
end
If I remove the CurrencyConverter.new(currency), then everything works fine. CurrencyConverter is a service object that I have.
Found the Issue
Thanks to @anothermh for these links:
https://guides.rubyonrails.org/threading_and_code_execution.html#wrapping-application-code
https://guides.rubyonrails.org/threading_and_code_execution.html#load-interlock
As per the guide, when one thread is performing an autoload by evaluating the class definition from the appropriate file, it is important that no other thread encounters a reference to the partially-defined constant.
Only one thread may load or unload at a time, and to do either, it must wait until no other threads are running application code. If a thread is waiting to perform a load, it doesn't prevent other threads from loading (in fact, they'll cooperate, and each perform their queued load in turn, before all resuming running together).
This can be resolved by permitting concurrent loads.
https://guides.rubyonrails.org/threading_and_code_execution.html#permit-concurrent-loads
threads = []
Rails.application.executor.wrap do
  currencies.each do |currency|
    threads << Thread.new do
      CurrencyConverter.new(currency)
      puts currency
    end
  end
  # Release the load interlock while the main thread waits, so the
  # spawned threads can autoload constants without deadlocking.
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
    threads.each(&:join)
  end
end
Thank you everybody for your time, I appreciate it.
Don't re-invent the wheel and use Sidekiq instead. 😉
From the project's page:
Simple, efficient background processing for Ruby.
Sidekiq uses threads to handle many jobs at the same time in the same process. It does not require Rails but will integrate tightly with Rails to make background processing dead simple.
With 400+ contributors and 10k+ stars on GitHub, they have built a solid parallel job execution framework that is production ready and easy to set up.
Have a look at their Getting Started guide to see for yourself.
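A minimal worker looks something like this (a sketch based on their getting-started example; HardWorker and its arguments are illustrative):

class HardWorker
  include Sidekiq::Worker

  def perform(name, count)
    # Runs on a background thread, outside the request/response cycle.
    puts "Doing hard work for #{name}, #{count} times!"
  end
end

# Enqueue from anywhere; arguments must be simple JSON-serializable types.
HardWorker.perform_async('bob', 5)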
I am working on a Rails 5.x application, and I use Postgres as my database.
I often run rake db:migrate on my production servers. Sometimes the migration will add a new column to the database, and this causes some controller actions to crash with the following error:
ActiveRecord::PreparedStatementCacheExpired: ERROR: cached plan must not change result type
This is happening in a critical controller action that needs to have zero downtime, so I need to find a way to prevent this crash from ever happening.
Should I catch the ActiveRecord::PreparedStatementCacheExpired error and retry the save? Or should I add some locking to this particular controller action, so that I don't start serving any new requests while a database migration is running?
What would be the best way to prevent this crash from ever happening again?
I was able to fix this issue in some places by using this retry_on_expired_cache helper:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  class << self
    # Retry automatically on ActiveRecord::PreparedStatementCacheExpired.
    # (Do not use this for transactions with side-effects unless it is acceptable
    # for these side-effects to occasionally happen twice.)
    def retry_on_expired_cache(*_args)
      retried ||= false
      yield
    rescue ActiveRecord::PreparedStatementCacheExpired
      raise if retried
      retried = true
      retry
    end
  end
end
I would use it like this:
MyModel.retry_on_expired_cache do
  @my_model.save
end
Unfortunately this was like playing "whack-a-mole", because this crash just kept happening all over my application during my rolling deploys (I'm not able to restart all the Rails processes at the same time.)
I finally learned that I can turn off prepared_statements to completely avoid this issue. (See this other question and answers on StackOverflow.)
I was worried about the performance penalty, but I found many reports from people who had set prepared_statements: false, and they hadn't noticed any problems. e.g. https://news.ycombinator.com/item?id=7264171
I created a file at config/initializers/disable_prepared_statements.rb:
db_configuration = ActiveRecord::Base.configurations[Rails.env]
db_configuration.merge!('prepared_statements' => false)
ActiveRecord::Base.establish_connection(db_configuration)
This allows me to continue setting the database configuration from the DATABASE_URL env variable, and 'prepared_statements' => false will be injected into the configuration.
This completely solves the ActiveRecord::PreparedStatementCacheExpired errors and makes it much easier to achieve high-availability for my service while still being able to modify the database.
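If you prefer to keep the override out of an initializer, the same flag can also be set in config/database.yml (a sketch, assuming a standard Postgres setup that reads DATABASE_URL):

production:
  adapter: postgresql
  url: <%= ENV['DATABASE_URL'] %>
  prepared_statements: false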
Sometimes I get errors in my delayed_job workers:
NameError: uninitialized constant Notifiers::MessageNotifierJob
Full backtrace: https://gist.github.com/olegantonyan/eeca9d612f9a10864efe
Notifiers::MessageNotifierJob is defined in app/jobs/notifiers/message_notifier_job.rb
By "sometimes" I mean that a job may fail -> retry -> succeed. The same thing happens with other jobs that have a namespace; jobs without a namespace work just fine.
I tried to add app/jobs/ to the autoload paths explicitly, without any luck:
config.autoload_paths += Dir[ Rails.root.join('app', 'jobs', '**/') ]
The job itself looks like this:
module Notifiers
  class MessageNotifierJob < BaseNotifierJob
    def perform(from, to, text)
      # some code to send a Slack notification
    end
  end
end
Solved. Neither Delayed Job nor the autoloader was to blame.
A week before adding these new jobs (like Notifiers::MessageNotifierJob), I had increased the number of delayed_job workers (using the capistrano3-delayed-job gem) from 1 to 4. But capistrano3-delayed-job hadn't killed the old delayed_job process; it had only started the 4 new ones. So I ended up with one old worker that knew nothing about my new job classes. Whenever that old process picked up a job, it failed; then one of the new processes picked the job up and succeeded.
I am building an application where at some point I need to sync a bunch of data from Facebook with my database, so I am attempting to use Delayed Job to push this into the background. Here is what part of my Delayed Job class looks like:
class FbSyncJob < Struct.new(:user_id)
  require 'RsvpHelper'

  def perform
    user = User.find(user_id)
    FbSyncJob.sync_user(user)
  end

  def FbSyncJob.sync_user(user)
    friends = HTTParty.get(
      "https://graph.facebook.com/me/friends?access_token=#{user.fb['token']}"
    )
    friends_list = friends["data"].map { |friend| friend["id"] }
    user.fb["friends"] = friends_list
    user.fb["sync"]["friends"] = Time.now
    user.save!
    FbSyncJob.friend_crawl(user)
  end
end
The RsvpHelper class lives in lib/RsvpHelper.rb. So at some point in my application I call Delayed::Job.enqueue(FbSyncJob.new(user.id)) with a known valid user. The worker I set up even tells me that the job has been completed successfully:
1 jobs processed at 37.1777 j/s, 0 failed
However, when I check the user in the database, his friends list has not been updated. Am I doing something wrong? Thanks so much for the help; this has been driving me crazy.
Delayed::Job.enqueue will put a record in the delayed_jobs table, but you need to run a separate process to execute the job code (the perform method).
Typically in development this would be bundle exec rake jobs:work. (NOTE: you must restart this rake task any time you make code changes; it will not autoload your changes.)
see https://github.com/collectiveidea/delayed_job#running-jobs
I usually put the following in my Delayed Job configuration while in development. It never puts a record in the delayed_jobs table and runs all background code synchronously (in development only), and by default Rails will reload changes to your code:
Delayed::Worker.delay_jobs = !(Rails.env.test? || Rails.env.development?)
https://github.com/collectiveidea/delayed_job#gory-details (see config/initializers/delayed_job_config.rb example section)
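Putting the pieces together, a config/initializers/delayed_job_config.rb might look like this (a sketch; the max_attempts and max_run_time values are illustrative, not requirements):

Delayed::Worker.delay_jobs = !(Rails.env.test? || Rails.env.development?)
Delayed::Worker.destroy_failed_jobs = false # keep failed jobs around for debugging
Delayed::Worker.max_attempts = 3            # default is 25
Delayed::Worker.max_run_time = 5.minutes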