Testing ActiveJob with Minitest doesn't hit Sidekiq queue - ruby-on-rails

On Rails 4.2 I have the following ActiveJob test:
test/jobs/import_job_test.rb
require 'test_helper'
class ImportJobTest < ActiveJob::TestCase
  def setup
    @response = ImportJob.perform_later "'testing Sidekiq queue jobs'"
  end
  test "enqueued jobs" do
    assert_enqueued_jobs 1
    clear_enqueued_jobs
    assert_enqueued_jobs 0
  end
  test "ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper" do
    assert_equal ["'testing Sidekiq queue jobs'"], @response.arguments
  end
  test "a second new job has been enqueued with the given arguments" do
    assert_enqueued_jobs 1
    assert_enqueued_with(job: ImportJob, args: ["'queuing a second job'"], queue: 'default') do
      ImportJob.perform_later "'queuing a second job'"
    end
    assert_enqueued_jobs 2
  end
end
Running the test it goes green:
$ rake test test/jobs/import_job_test.rb
Started with run options --seed 35322
4/4: [===================================] 100% Time: 00:00:00, Time: 00:00:00
Finished in 0.01380s
4 tests, 7 assertions, 0 failures, 0 errors, 0 skips
but it never really touches the Sidekiq queue. The test is also green when Sidekiq is turned off, which I don't want. Of course, when I run the job from the console, the queue is hit.
How can I make the test REALLY hit the queue in test mode?

The reason for using an API like ActiveJob is to allow you to write your code to an abstract API so you can change adapters. In other words, your ActiveJob classes should be able to move from Sidekiq to Que without making any changes to your code. Because of this, ActiveJob::TestCase uses a test adapter that makes asserting job behavior easier.
That said, if you really want to make your jobs hit a running queue, you should configure your test environment accordingly and inherit from a test class that doesn't use ActiveJob::TestHelper.
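To see why the test stays green with Sidekiq turned off, here is a toy model of the adapter idea in plain Ruby. It is not ActiveJob's real implementation: `InMemoryTestAdapter` and `ToyImportJob` are made-up names, and the only point is that `perform_later` hands the job to whatever adapter is configured, so a test adapter can record it without any backend running.

```ruby
# Toy model of the adapter pattern. NOT ActiveJob's real code; all class
# names here are hypothetical and exist only for illustration.
class InMemoryTestAdapter
  attr_reader :enqueued_jobs

  def initialize
    @enqueued_jobs = []
  end

  # Records the job in memory instead of pushing it to a real queue backend.
  def enqueue(job_class, args)
    @enqueued_jobs << { job: job_class, args: args }
  end
end

class ToyImportJob
  class << self
    # The job only knows "some adapter"; it never talks to a backend directly.
    attr_accessor :queue_adapter

    def perform_later(*args)
      queue_adapter.enqueue(self, args)
    end
  end
end

adapter = InMemoryTestAdapter.new
ToyImportJob.queue_adapter = adapter
ToyImportJob.perform_later("'testing Sidekiq queue jobs'")

puts adapter.enqueued_jobs.size
```

In a real app the equivalent switch is the adapter setting for the test environment (for example `config.active_job.queue_adapter = :sidekiq`), combined with a test class that does not pull in ActiveJob::TestHelper, as described above.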

Related

Unexpected sidekiq jobs get executed

I'm using sidekiq cron to run some jobs. I have a parent job which should only run once, and that parent job starts 7 million child jobs. However, my sidekiq dashboard says over 42 million jobs are enqueued. I checked those enqueued jobs, and they are my child jobs. I'm trying to figure out why so many more jobs than expected are enqueued. I checked the sidekiq log, and one thing I noticed is that "Cron Jobs - add job with name: new_topic_post_job" shows up many times. new_topic_post_job is the name of the parent job in schedule.yml. The following lines also show up many times:
2019-04-18T17:01:22.558Z 12605 TID-osb3infd0 WARN: Processing recovered job from queue queue:low (queue:low_i-03933b94d1503fec0.nodemodo.com_4): "{\"retry\":false,\"queue\":\"low\",\"backtrace\":true,\"class\":\"WeeklyNewTopicPostCron\",\"args\":[],\"jid\":\"f37382211fcbd4b335ce6c85\",\"created_at\":1555606809.2025042,\"locale\":\"en\",\"enqueued_at\":1555606809.202564}"
2019-04-18T17:01:22.559Z 12605 TID-osb2wh8to WeeklyNewTopicPostCron JID-f37382211fcbd4b335ce6c85 INFO: start
WeeklyNewTopicPostCron is the name of the parent job class. Does this mean my parent job runs multiple times instead of only once? If so, what's the cause? I'm pretty sure the time in the cron job is right: I set it to "0 17 * * 4", which means it only runs once a week. I also set retry to false for the parent job and 3 for the child jobs. So even if all child jobs fail, we should still only have 21 million jobs. Following is my cron job setting in schedule.yml
new_topic_post_job:
  cron: "0 17 * * 4"
  class: "WeeklyNewTopicPostCron"
  queue: low
and this is WeeklyNewTopicPostCron:
class WeeklyNewTopicPostCron
  include Sidekiq::Worker
  sidekiq_options queue: :low, retry: false, backtrace: true
  def perform
    processed_user_ids = Set.new
    TopicFollower.select("id, user_id").find_in_batches(batch_size: 1000000) do |topic_followers|
      new_user_ids = []
      topic_followers.map(&:user_id).each { |user_id| new_user_ids << user_id if processed_user_ids.add?(user_id) }
      batch_size = 1000
      offset = 0
      loop do
        batched_user_ids_for_redis = new_user_ids[offset, batch_size]
        Sidekiq::Client.push_bulk('class' => NewTopicPostSender,
                                  'args' => batched_user_ids_for_redis.map { |user_id| [user_id, 7] }) if batched_user_ids_for_redis.present?
        break if batched_user_ids_for_redis.size < batch_size
        offset += batch_size
      end
    end
  end
end
Most probably your parent sidekiq job is causing the sidekiq process to crash, which then results in a worker restart. On restart sidekiq probably tries to recover the interrupted job and starts processing it again (from the beginning). Some details here:
https://github.com/mperham/sidekiq/wiki/Reliability#recovering-jobs
This probably happens multiple times before the parent job eventually finishes, and hence an extremely high number of child jobs is created. You can easily verify this by checking the process id of the sidekiq process while this job is being run; it will most probably keep changing after a while:
ps aux | grep sidekiq
It could be that you have some monit configuration to restart sidekiq in case memory usage goes too high. Or it might be that this query is causing the process to crash:
TopicFollower.select("id, user_id").find_in_batches(batch_size: 1000000)
Try reducing the batch_size. 1 million feels like too high a number. But my best guess is that the sidekiq process dies while processing the long-running parent job.
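As a rough sanity check on the restart theory: if each run of the parent enqueues about 7 million child jobs, 42 million enqueued jobs would match roughly six runs. Separately, the manual offset loop in the question can be written with `each_slice`. Below is a minimal sketch in plain Ruby, assuming the user ids are already loaded into arrays. `bulk_payloads` is a hypothetical helper; it only builds the argument lists and does not call Sidekiq. The second argument `7` is copied from the question, and its meaning isn't stated there.

```ruby
require 'set'

# Hypothetical helper: turns batches of user ids into the argument lists a
# bulk enqueue would receive, skipping ids that were already seen.
def bulk_payloads(user_id_batches, slice_size: 1000, second_arg: 7)
  seen = Set.new
  payloads = []
  user_id_batches.each do |batch|
    # Set#add? returns nil for ids that are already in the set.
    fresh = batch.select { |user_id| seen.add?(user_id) }
    fresh.each_slice(slice_size) do |slice|
      payloads << slice.map { |user_id| [user_id, second_arg] }
    end
  end
  payloads
end

batches = [[1, 2, 3, 2], [3, 4, 5]]
payloads = bulk_payloads(batches, slice_size: 2)
puts payloads.inspect
```

Each element of `payloads` would become the 'args' of one Sidekiq::Client.push_bulk call, as in the question's code.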

How to run a cron job every 10 seconds in ruby on rails

I am trying to run a cron job every 10 seconds that runs a piece of code. The approach I have used runs the code and then sleeps for 10 seconds, but it seems to drastically degrade the app's performance. I am using the whenever gem, which runs every minute, and the code sleeps for 10 seconds between runs. How can I achieve the same without using the sleep method? Following is my code.
every 1.minute do
  runner "DailyNotificationChecker.send_notifications"
end
class DailyNotificationChecker
  def self.send_notifications
    puts "Triggered send_notifications"
    expiry_time = Time.now + 57
    while (Time.now < expiry_time)
      if RUN_SCHEDULER == "true" || RUN_SCHEDULER == true
        process_notes
      end
      sleep 10 #seconds
    end
  end
  def self.process_notes
    notes = nil
    time = Benchmark.measure do
      Note.uncached do
        notes = Note.where(status: false)
        notes.update_all(status: true)
      end
    end
    puts "time #{time}"
  end
end
Objective of my code is to change the boolean status of objects to true which gets checked every 10 seconds. This table has 2 million records.
I suggest using Sidekiq background jobs for this. With the sidekiq-scheduler gem you can run ordinary Sidekiq jobs on whatever interval you need. Bonus points for having a web interface to handle and monitor the jobs via the Sidekiq gem.
You could use the clockwork gem. It runs in a separate process. The configuration is pretty simple.
require 'clockwork'
include Clockwork
every(10.seconds, 'frequent.job') { DailyNotificationChecker.process_notes }

Retry Sidekiq worker from within worker

In my app I am trying to perform two worker tasks sequentially.
First, a PDF is being created with Wicked pdf and then, once the PDF is created, to send an email to two different recipients with the PDF attached.
This is what is called in the controller:
PdfWorker.perform_async(@d.id)
MailingWorker.perform_in(1.minutes, @d.id, @d.class.name.to_s)
First worker creates the PDF and second worker sends email.
Here is second worker :
class MailingWorker
  include Sidekiq::Worker
  sidekiq_options retry: false
  def perform(d_id, model)
    @d = eval(model).find(d_id)
    @model = model
    if @d.pdf.present?
      ProfessionnelMailer.notification_d(@d).deliver
      ClientMailer.notification_d(@d).deliver
    else
      MailingWorker.perform_in(1.minutes, @d.id, @model.to_s)
    end
  end
end
The if statement checks whether the PDF has been created. If it has, two mails are sent; otherwise, the same worker is called again one minute later, to give the Heroku server extra time to process the PDF creation in case it takes longer or the queue is long.
However, if the PDF has definitely failed to be processed, the above ends up in an infinite loop.
Is there a way to fix this?
One option I see is calling the second worker inside the PDF creation worker, though I don't really want to nest workers too deep. Having them separate makes my controller clearer, because I can see the sequence of actions. But any advice is welcome.
Another option is to use sidekiq_options retry: 5 and request a retry of the controller that could be counted towards the full total of 5 retries, instead of retrying the worker with else MailingWorker.perform_in(1.minutes, @d.id, @model.to_s), but I don't know how to do this. As per this thread https://github.com/mperham/sidekiq/issues/769 it would be to raise an exception, but I am not sure how to do this... (also I am not sure how long the retry will wait before being processed with the exception method; with the solution above I can control the time frame).
If you do not want to have nested workers, then in MailingWorker, instead of enqueuing it again, raise an exception if the PDF is not present.
Also, configure the worker retry option, so that Sidekiq will push it to the retry queue and run it again after some time. According to the documentation,
Sidekiq will retry failures with an exponential backoff using the
formula (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1)) (i.e.
15, 16, 31, 96, 271, ... seconds + a random amount of time). It will
perform 25 retries over approximately 21 days.
Worker code will be more like,
class MailingWorker
  include Sidekiq::Worker
  sidekiq_options retry: 5
  def perform(d_id, model)
    @d = eval(model).find(d_id)
    @model = model
    if @d.pdf.present?
      ProfessionnelMailer.notification_d(@d).deliver
      ClientMailer.notification_d(@d).deliver
    else
      raise "PDF not present"
    end
  end
end
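On the question of how long the retries will wait: the formula quoted from the documentation can be evaluated directly. This is a small sketch in plain Ruby that only reproduces the quoted arithmetic; `base_retry_delay` and `retry_delay` are hypothetical helper names, not Sidekiq methods, and other Sidekiq versions may use a different formula.

```ruby
# Deterministic part of the quoted formula: (retry_count ** 4) + 15.
def base_retry_delay(retry_count)
  (retry_count**4) + 15
end

# Full delay as quoted: the base plus rand(30) * (retry_count + 1) seconds.
def retry_delay(retry_count, random: Random.new)
  base_retry_delay(retry_count) + (random.rand(30) * (retry_count + 1))
end

# With retry: 5, the retry counts are 0 through 4.
base_delays = (0...5).map { |count| base_retry_delay(count) }
# rand(30) is at most 29, so this is the largest possible total jitter.
max_jitter = (0...5).sum { |count| 29 * (count + 1) }

puts base_delays.inspect                   # [15, 16, 31, 96, 271]
puts base_delays.sum                       # 429
puts base_delays.sum + max_jitter          # 864
```

So under that formula, five retries are spread over somewhere between about 7 and about 14.5 minutes before the job is given up on, which bounds how long the worker keeps waiting for the PDF.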
I believe the "correct" and most asynchronous way to do this is to have two queues, and two workers:
Queue 1: CreatePdfWorker
Queue 2: SendPdfWorker
When the CreatePdfWorker has generated the PDF, it then enqueues the SendPdfWorker with the newly generated PDF and recipients.
This way, each worker can work independently and pluck from the queue asynchronously, and you're not struggling against the design choices of Sidekiq.
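The hand-off can be sketched without Sidekiq at all. The following is an in-memory stand-in using Ruby's built-in Queue and two threads; the variable names and the fake PDF string are assumptions for illustration only. The point it shows is that the mail step is enqueued only after the PDF step has finished, so no polling or re-enqueuing is needed.

```ruby
# In-memory stand-ins for the two Sidekiq queues (hypothetical names).
pdf_queue = Queue.new
mail_queue = Queue.new
sent = []

# Plays the role of CreatePdfWorker: builds the PDF, then enqueues the mail job.
create_pdf_worker = Thread.new do
  while (doc_id = pdf_queue.pop) != :done
    pdf = "pdf-for-#{doc_id}"     # stands in for the real PDF generation
    mail_queue << [doc_id, pdf]   # the mail job exists only once the PDF does
  end
  mail_queue << :done
end

# Plays the role of SendPdfWorker: it only ever sees documents with a PDF.
send_pdf_worker = Thread.new do
  while (job = mail_queue.pop) != :done
    doc_id, pdf = job
    sent << "mailed #{pdf} for document #{doc_id}"
  end
end

[1, 2].each { |doc_id| pdf_queue << doc_id }
pdf_queue << :done
[create_pdf_worker, send_pdf_worker].each(&:join)

puts sent
```

With real Sidekiq workers, the line that pushes onto `mail_queue` would correspond to calling `SendPdfWorker.perform_async` at the end of `CreatePdfWorker#perform`.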

Sidekiq jobs won't run in same time in different queues

I have 2 Sidekiq workers:
Foo:
# frozen_string_literal: true
class FooWorker
  include Sidekiq::Worker
  sidekiq_options queue: :foo
  def perform
    loop do
      File.open(File.join(Rails.root, 'foo.txt'), 'w') { |file| file.write('FOO') }
    end
  end
end
Bar:
# frozen_string_literal: true
class BarWorker
  include Sidekiq::Worker
  sidekiq_options queue: :bar
  def perform
    loop do
      File.open(File.join(Rails.root, 'bar.txt'), 'w') { |file| file.write('BAR') }
    end
  end
end
Both have pretty much the same functionality, each runs on a different queue, and the YAML file looks like this:
---
:queues:
  - foo
  - bar
development:
  :concurrency: 5
The problem is, even though both are running and showing in the Busy page of the Sidekiq UI, only one of them will actually create a file and put contents in it. Shouldn't Sidekiq be multi-threaded?
Update:
this happens only on my machine
I created a new project with rails new and got the same result
I cloned a colleague's project and ran his Sidekiq, and it is working!!!
I used his Sidekiq version, not working!
New Update:
this also happens on my colleague's machine if he clones my project
if I run 2 jobs with a finite loop (like doing something 10 times with a sleep), the first job is executed and then the second, but after the second finishes and they start again, both run at the same time as expected -- everyone who cloned the project from github.com/ArayB/sidekiq-test encountered the problem.
It's not an issue with Sidekiq. It's an issue somewhere in Ruby/MRI/Thread/GIL. Google for more info, but my understanding is that MRI threads don't run Ruby code in parallel: because of the global interpreter lock (GIL), only one thread can execute Ruby code at a time (very old Ruby versions went further and used "green threads", which only simulate threading).
It's interesting that with only two threads the system isn't giving time to the second thread. No idea why, but it must realize its mistake when you run it again.
Interestingly if you run your same app but instead fire off 10 TestWorkers (and tweak the output so you can tell the difference) sidekiq will run all 10 "at once".
10.times {|i| TestWorker.perform_async(i) }
Here is the tweaked worker. Be sure to flush the output, because otherwise threading and TTY buffering can make the output not reflect reality.
class TestWorker
  include Sidekiq::Worker
  def perform(n)
    10.times do |i|
      puts "#{n} - #{i} - #{Time.current}"
      $stdout.flush
      sleep 1
    end
  end
end
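The same experiment can be tried without Sidekiq. This is a plain-Ruby stand-in for the tweaked worker above, with threads in place of Sidekiq jobs and a shorter sleep; it is a sketch to play with, not a reproduction of the original bug. Each thread sleeps between iterations, and sleeping releases the interpreter lock, so the other threads get a turn.

```ruby
# Plain-Ruby stand-in for several workers running at once; no Sidekiq involved.
results = Queue.new
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

threads = 3.times.map do |n|
  Thread.new do
    3.times do |i|
      results << [n, i]   # Queue is thread-safe, so no extra locking is needed
      sleep 0.05          # sleeping lets the other threads run
    end
  end
end

threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# Drain the queue into an array in the order the entries were recorded.
entries = []
entries << results.pop until results.empty?

puts "#{entries.size} entries in #{elapsed.round(2)}s"
puts entries.inspect
```

If the threads interleave, the recorded entries mix the three thread numbers rather than listing all of thread 0 first, and the elapsed time stays close to what a single thread needs.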
Some interesting links:
https://en.wikipedia.org/wiki/Green_threads
http://ruby-doc.org/core-2.4.1/Thread.html#method-c-pass
https://github.com/ruby/ruby/blob/v2_4_1/thread.c
Does ruby have real multithreading?

Rails ActiveJob Job with sidekiq delaying 6/10 seconds

I am using Rails ActiveJob with Sidekiq.
I have a job that is supposed to execute after 5 seconds.
UserArrivalJob.set(wait: 5.seconds).perform_later(user, planet)
But after 5 seconds the job still hasn't run.
When I look in the Sidekiq web interface after those 5 seconds, the job is there and it says: Not yet enqueued. After about another 6 to 10 seconds the job gets enqueued and is immediately executed.
Why is there this delay?
When I use perform_now this delay is not there.
Here is my Job:
class UserArrivalJob < ActiveJob::Base
  queue_as :default
  def perform(user, planet)
    user.planet = planet
    user.save
  end
end
Read here. Basically, I think your Sidekiq poller runs every 10 seconds and picks up the job when it polls.
bcd was right. I set the Sidekiq configuration to run the poller every 2 seconds.
environments/development.rb / environments/production.rb
Sidekiq.configure_server do |config|
  config.average_scheduled_poll_interval = 2
end
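To get a feel for why the poll interval shows up as extra delay, here is a deliberately simplified model in plain Ruby. It assumes the poller wakes at fixed multiples of the interval, counted from the moment the job was scheduled; real Sidekiq polling is not that regular, and `pickup_delay` is a hypothetical helper, not a Sidekiq method.

```ruby
# Simplified model (an assumption, not Sidekiq's actual scheduling): the
# poller wakes every poll_interval seconds and enqueues any scheduled job
# whose wait time has already passed.
def pickup_delay(wait_seconds, poll_interval)
  # First poll tick at or after the moment the job becomes due.
  ticks = (wait_seconds.to_f / poll_interval).ceil
  (ticks * poll_interval) - wait_seconds
end

puts pickup_delay(5, 10) # => 5 extra seconds with a 10 second poll
puts pickup_delay(5, 2)  # => 1 extra second with a 2 second poll
```

In this model the extra delay is always less than one poll interval, which is why lowering the interval shrinks the gap between "due" and "enqueued".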
