Rufus Scheduler vs Loop with Sleep - ruby-on-rails

I need to run an ActiveJob in my Rails 5 application as often as possible.
Up until now, I was using a cron job that was defined using the whenever gem (used Crono in the past).
This approach boots up the entire rails app before it does its thing, and then shuts down and does it all over again. I wish to avoid it, and instead have a persistent "service" that does the job.
I bumped into the Rufus Scheduler gem which seems nice, but then I started wondering if I need it at all.
So, my question is:
Are there any meaningful differences between these two options below:
# With Rufus
scheduler = Rufus::Scheduler.new
scheduler.every('1s') { puts 'hello' }
scheduler.join
# Without Rufus
loop { puts "hello" ; sleep 1 }
Note that either of these scripts will be executed with rails runner my_scheduler.rb as a docker container, and any solution must ensure the job runs once only (never two in parallel).

There are differences.
Rufus-scheduler every will try to run every 1s, so at t0 + 1, t0 + 2, t0 + 3, etc. While the "sleep" variant will run at t0, t0 + wt1 + 1, t0 + wt1 + 1 + wt2 + 1, ... (where wtN = work time for Nth call of puts hello).
Rufus-scheduler every will use rufus-scheduler work threads so that if there is an overlap (wt(Nb) > wt(Na)), Nb will occur anyway. This behaviour can be changed (see job options).
Rufus-scheduler interval is closer to your sleep option:
scheduler.interval('1s') { puts 'hello' }
It places 1 second (or whatever time interval you want) between each run of puts 'hello'.
For a simple do-sleep-loop, I'd go with the sleep option. But if you need more, rufus-scheduler has cron, interval, every and tuning options for them. And you can schedule multiple jobs.
scheduler.cron('0 1 1 * *') do
# every first of the month at 1am
flush_archive
end
scheduler.cron('0 8,13,19 * * mon-fri') do
# every weekday at 8am, 1pm, 7pm
clean_dishes
end
scheduler.interval('1s') do
puts 'hello'
end

Related

waitForCompletion(timeout) in Abaqus API does not actually kill the job after timeout passes

I'm doing a parametric sweep of some Abaqus simulations, and so I'm using the waitForCompletion() function to prevent the script from moving on prematurely. However, occassionally the combination of parameters causes the simulation to hang on one or two of the parameters in the sweep for something like half an hour to an hour, whereas most parameter combos only take ~10 minutes. I don't need all the data points, so I'd rather sacrifice one or two results to power through more simulations in that time. Thus I tried to use waitForCompletion(timeout) as documented here. But it doesn't work - it ends up functioning just like an indefinite waitForCompletion, regardless of how low I set the wait time. I am using Abaqus 2017, and I was wondering if anyone else had gotten this function to work and if so how?
While I could use a workaround like adding a custom timeout function and using the kill() function on the job, I would prefer to use the built-in functionality of the Abaqus API, so any help is much appreciated!
It seems like starting from a certain version the timeOut optional argument was removed from this method: compare the "Scripting Reference Manual" entry in the documentation of v6.7 and v6.14.
You have a few options:
From Abaqus API: Checking if the my_abaqus_script.023 file still exists during simulation:
import os, time
timeOut = 600
total_time = 60
time.sleep(60)
# whait untill the the job is completed
while os.path.isfile('my_job_name.023') == True:
if total_time > timeOut:
my_job.kill()
total_time += 60
time.sleep(60)
From outside: Launching the job using the subprocess
Note: don't use interactive keyword in your command because it blocks the execution of the script while the simulation process is active.
import subprocess, os, time
my_cmd = 'abaqus job=my_abaqus_script analysis cpus=1'
proc = subprocess.Popen(
my_cmd,
cwd=my_working_dir,
stdout='my_study.log',
stderr='my_study.err',
shell=True
)
and checking the return code of the child process suing poll() (see also returncode):
timeOut = 600
total_time = 60
time.sleep(60)
# whait untill the the job is completed
while proc.poll() is None:
if total_time > timeOut:
proc.terminate()
total_time += 60
time.sleep(60)
or waiting until the timeOut is reached using wait()
timeOut = 600
try:
proc.wait(timeOut)
except subprocess.TimeoutExpired:
print('TimeOut reached!')
Note: I know that terminate() and wait() methods should work in theory but I haven't tried this solution myself. So maybe there will be some additional complications (like looking for all children processes created by Abaqus using psutil.Process(proc.pid) )

Rufus scheduler fires at the scheduled time (expected ) and fires again after completion of previous run

I have scheduled rufus cron to run a method (a method to post a complex query in remote DB, retrieves result and, do some processing ) every hour and 99% of time the methods completes execution <1 hr.
Rufus fires the method fine at every expected time, but the problem is some times (1% fo time) the method wont complete in a hour; the scheduler fires the second one at the expected time and fires the second one again, on completion of previous run.
For example,
pin_sampling_job_scheduler = Rufus::Scheduler.new
pin_sampling_job_scheduler.cron("22 * * * *") do
puts "sampling starts"
DatafetchRedshift.pin_sampling_job
end
for hour = 00, the method fires at 22nd min(as expected and assume the job is running for 1.5 hour) For hour =01, method is fires at 22nd min and again at 52nd min (after completion of first run). Why is it happening like this? Any help or suggestion will be appreciated.
Rufus version -3.3.4
Rails - 4.1.6
Ruby 2.2.0
P.S - I wont be able to change the scheduler runs to be more than an hour, since 99% it completes
https://github.com/jmettraux/rufus-scheduler#overlap--false
pin_sampling_job_scheduler = Rufus::Scheduler.new
pin_sampling_job_scheduler.cron("22 * * * *", overlap: false) do
ts = Time.now.strftime('%Y%m%d%H%M')
puts "#{Time.now} - sampling #{ts} starts..."
DatafetchRedshift.pin_sampling_job
puts "#{Time.now} - sampling #{ts} stopped."
end
for hour = 00, the method fires at 22nd min(as expected and assume the job is running for 1.5 hour) For hour =01, method is fires at 22nd min and again at 52nd min (after completion of first run). Why is it happening like this?
You can report issues at https://github.com/jmettraux/rufus-scheduler/issues (please read https://github.com/jmettraux/rufus-scheduler#getting-help before filing an issue)

Why is lua so slow in redis? Any workarounds?

I'm evaluating the use of lua scrips in redis, and they seem to be a bit slow. I a benchmark as follows:
For a non-lua version, I did a simple SET key_i val_i 1M times
For a lua version, I did the same thing, but in a script: EVAL "SET KEYS[1] ARGV[1]" 1 key_i val_i
Testing on my laptop, the lua version is about 3x slower than the non-lua version. I understand that lua is a scripting language, not compiled, etc. etc. but this seems like a lot of performance overhead--is this normal?
Assuming this is indeed normal, are there any workaround? Is there a way to implement a script in a faster language, such as C (which redis is written in) to achieve better performance?
Edit: I am testing this using the go code located here: https://gist.github.com/ortutay/6c4a02dee0325a608941
The problem is not with Lua or Redis; it's with your expectations. You are compiling a script 1 million times. There is no reason to expect this to be fast.
The purpose of EVAL within Redis is not to execute a single command; you could do that yourself. The purpose is to do complex logic within Redis itself, on the server rather than on your local client. That is, instead of doing one set operation per-EVAL, you actually perform the entire series of 1 million sets within a single EVAL script, which will be executed by the Redis server itself.
I don't know much about Go, so I can't write the syntax for calling it. But I know what the Lua script would look like:
for i = 1, ARGV[1] do
local key = "key:" .. tostring(i)
redis.call('SET', key, i)
end
Put that in a Go string, then pass that to the appropriate call, with no key arguments and a single non-key argument that is the number of times to loop.
I stumbled on this thread and was also curious of the benchmark results. I wrote a quick Ruby script to compare them. The script does a simple "SET/GET" operation on the same key using different options.
require "redis"
def elapsed_time(name, &block)
start = Time.now
block.call
puts "#{name} - elapsed time: #{(Time.now-start).round(3)}s"
end
iterations = 100000
redis_key = "test"
redis = Redis.new
elapsed_time "Scenario 1: From client" do
iterations.times { |i|
redis.set(redis_key, i.to_s)
redis.get(redis_key)
}
end
eval_script1 = <<-LUA
redis.call("SET", "#{redis_key}", ARGV[1])
return redis.call("GET", "#{redis_key}")
LUA
elapsed_time "Scenario 2: Using EVAL" do
iterations.times { |i|
redis.eval(eval_script1, [redis_key], [i.to_s])
}
end
elapsed_time "Scenario 3: Using EVALSHA" do
sha1 = redis.script "LOAD", eval_script1
iterations.times { |i|
redis.evalsha(sha1, [redis_key], [i.to_s])
}
end
eval_script2 = <<-LUA
for i = 1,#{iterations} do
redis.call("SET", "#{redis_key}", tostring(i))
redis.call("GET", "#{redis_key}")
end
LUA
elapsed_time "Scenario 4: Inside EVALSHA" do
sha1 = redis.script "LOAD", eval_script2
redis.evalsha(sha1, [redis_key], [])
end
eval_script3 = <<-LUA
for i = 1,2*#{iterations} do
redis.call("SET", "#{redis_key}", tostring(i))
redis.call("GET", "#{redis_key}")
end
LUA
elapsed_time "Scenario 5: Inside EVALSHA with 2x the operations" do
sha1 = redis.script "LOAD", eval_script3
redis.evalsha(sha1, [redis_key], [])
en
I got the following results running on my Macbook pro
Scenario 1: From client - elapsed time: 11.498s
Scenario 2: Using EVAL - elapsed time: 6.616s
Scenario 3: Using EVALSHA - elapsed time: 6.518s
Scenario 4: Inside EVALSHA - elapsed time: 0.241s
Scenario 5: Inside EVALSHA with 2x the operations - elapsed time: 0.5s
In summary:
scenario 1 vs. scenario 2 show that the main contributor is the round trip time as scenario 1 makes 2 requests to Redis while scenario 2 only makes 1 and scenario 1 is ~2x the execution time
scenario 2 vs. scenario 3 shows that EVALSHA does provide some benefit and I am sure this benefit increases the more complex the script gets
scenario 4 vs scenario 5 shows the overhead of invoking the script is near minimal as we doubled the number of operations and saw a ~2x increase in execution time.
so there is now a workaround using a module created by John Sully. It works for Redis and KeyDB and allows you to use the V8 JIT engine which runs complex scripts much faster than Lua scripts. https://github.com/JohnSully/ModJS

I want Jenkins job to build every two weeks

Will this expression run the build every other Friday at noon? Assume i set this up on a Friday?
0 12 * * */14
I tried 0 12 * * FRI/14 but Jenkins returned an error.
I ma trying to run a code report job every two weeks to match scrum.
You'll have to add some logic to the build script to determine if it ran last week, and then run it every week.
I looked around similar questions for cron jobs, and you have to do some shell magic to make it work.
You could try what was suggested here:
H H 8-14,22-28 * 5
Which would qualify on Fridays that are in the second or fourth week of the month.
it will run at noon every other friday
00 12 */2 * 5
I had the same issue, the easy work around I found was to create another job that run weekly.
This job was a simple groovy script that does the following:
import jenkins.model.*;
def job = Jenkins.instance.getJob('JobNameToRunEveryTwoWeek')
job.setDisabled(!job.isDisabled())
Since Jenkins does not offer the functionnality its the best easy solution I could find. If you have better solution feel free to let me know.
One ridiculous-looking-but-it-works answer: schedule your job to run every week, and then at the top of the job add the following:
// Suppressing even number builds, so this job only runs
// every other week.
def build_number = env.BUILD_NUMBER as int
if ((build_number % 2) == 0) {
echo "Suppressing even number builds!"
echo """THIS IS A HACK TO MAKE THIS JOB RUN BIWEEKLY.
Jenkins cron scheduling currently doesn't support scheduling a
bi-weekly job. We could resort to shell or other tricks to
calculate if the job should be run (e.g., comparing to the date
of the last run job), it's annoying, and this works just as well.
Schedule this job to run weekly. It will exit early every other week.
refs:
* https://stackoverflow.com/questions/33785196/i-want-jenkins-job-to-build-every-two-weeks
* https://issues.jenkins-ci.org/browse/JENKINS-19756
"""
currentBuild.result = 'SUCCESS'
return
}
For Jenkins, you can try this approach as well.
1 1 8-14,21-28 * 5

Jenkins Cron Expression Not Scheduled at Right time

All,
Tried to configure jenkins job to trigger at EVERYDAY 10AM and used below cron H 10 * * * but the jenkins console is not running at 10AM rather its running at 10.09AM. Please help me to run at 10AM everyday around the year.
update: After adding the expression with '0 10 * * *', got below warning and no next run time is displayed. is that normal?
30 5 * * *
will run every day at 5:30 AM
#daily
will run the job once a day at some time, chosen by Jenkins
0 10 * * *
will run every day at 10:00 AM
When not using 'H' in the beginning you will get the warning, you will not get a tip when the job will or would have run, but it will still be active, i.e. it will run as per the statement.
You will always see a syntax error in red color when making any syntax errors.
Also, a good idea might be to create a dummy job to experiment with cron trigger, if you don't feel comfortable with it yet. Or use crontab on Linux:
crontab -e
man crontab
If you really want it at 10AM, please use 0 10 * * *
or 5:30 AM as you asked in comment
30 5 * * *
Jenkins will warn you in this case. If every job schedules like this, load suddenly goes up. Jenkins advice you to differ job time a bit. The H indicate once in every hour, not particularly at 0th minute.

Resources