Can Gramex scheduler take dynamic schedule values from a file or database?

Trying to run some CLI on a schedule, like:
create-job-once-a-day:
  hours: <DYNAMIC VALUE>
  minutes: <DYNAMIC VALUE>
  thread: true
  function: >
    logging.critical('HELLO')
Is it possible to take the DYNAMIC VALUES from dynamic sources like a file?

One possibility is to run every minute, and check the time (or any other condition) in the function. For example:
create-job-once-a-day:
  hours: '*'
  minutes: '*'
  thread: true
  function: >
    mymodule.my_schedule_function()
In mymodule.py:
import logging

def my_schedule_function():
    if some_dynamic_condition_is_true():
        logging.critical('HELLO')
        # perform the rest of the task
Schedules are quite efficient, so this won't impact the CPU.
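For example, some_dynamic_condition_is_true() could compare the current time against values kept in a file. A minimal sketch, assuming a schedule.json file with hours/minutes keys (the file name and format are illustrative, not a Gramex convention):

import datetime
import json

def some_dynamic_condition_is_true(path='schedule.json'):
    # schedule.json is assumed to look like: {"hours": 14, "minutes": 30}
    with open(path) as handle:
        schedule = json.load(handle)
    now = datetime.datetime.now()
    # Since the scheduler calls this once a minute, it fires exactly once
    # in the minute that matches the configured time
    return now.hour == schedule['hours'] and now.minute == schedule['minutes']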

Related

Measure the duration of x amount of requests while using K6

I would like to use K6 in order to measure the time it takes to process 1.000.000 requests (in total) by an API.
Scenario
Execute 1.000.000 (1 million in total) GET requests with 50 concurrent users/threads, so every user/thread executes 20.000 requests.
I've managed to create such a scenario with Artillery.io, but I'm not sure how to create the same one while using K6. Could you point me in the right direction in order to create the scenario? (Most examples are using a pre-defined duration, but in this case I don't know the duration -> this is exactly what I want to measure).
Artillery yml
config:
  target: 'https://localhost:44000'
  phases:
    - duration: 1
      arrivalRate: 50
scenarios:
  - flow:
      - loop:
          - get:
              url: "/api/Test"
        count: 20000
K6 js
import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
    iterations: 1000000,
    vus: 50
};

export default function () {
    let res = http.get('https://localhost:44000/api/Test');
    check(res, { 'success': (r) => r.status === 200 });
}
The iterations + vus you've specified in your k6 script options would result in a shared-iterations executor, where VUs will "steal" iterations from the common pile of 1m iterations. So, the faster VUs will complete slightly more than 20k requests, while the slower ones will complete slightly less, but overall you'd still get 1 million requests. And if you want to see how quickly you can complete 1m requests, that's arguably the better way to go about it...
However, if having exactly 20k requests per VU is a strict requirement, you can easily do that with the aptly named per-vu-iterations executor:
export let options = {
    discardResponseBodies: true,
    scenarios: {
        'million_hits': {
            executor: 'per-vu-iterations',
            vus: 50,
            iterations: 20000,
            maxDuration: '2h',
        },
    },
};
In any case, I strongly suggest setting maxDuration to a high value, since the default value is only 10 minutes for either executor. And discardResponseBodies will probably help with the performance, if you don't care about the response body contents.
By the way, you can also do in k6 what you've done in Artillery: have 50 VUs start a single iteration each and just loop the http.get() call 20000 times inside that one iteration. You won't get a very nice UX that way, since the k6 progress bars will be frozen until the very end (k6 has no idea of your actual progress inside each iteration), but it will also work.

waitForCompletion(timeout) in Abaqus API does not actually kill the job after timeout passes

I'm doing a parametric sweep of some Abaqus simulations, and so I'm using the waitForCompletion() function to prevent the script from moving on prematurely. However, occasionally the combination of parameters causes the simulation to hang on one or two of the parameters in the sweep for something like half an hour to an hour, whereas most parameter combos take only ~10 minutes. I don't need all the data points, so I'd rather sacrifice one or two results to power through more simulations in that time. Thus I tried to use waitForCompletion(timeout) as documented here. But it doesn't work: it ends up functioning just like an indefinite waitForCompletion, regardless of how low I set the wait time. I am using Abaqus 2017, and I was wondering if anyone else had gotten this function to work, and if so, how?
While I could use a workaround like adding a custom timeout function and using the kill() function on the job, I would prefer to use the built-in functionality of the Abaqus API, so any help is much appreciated!
It seems that, starting from a certain version, the optional timeOut argument was removed from this method: compare the "Scripting Reference Manual" entry in the documentation of v6.7 and v6.14.
You have a few options:
From Abaqus API: Checking if the my_abaqus_script.023 file still exists during simulation:
import os, time

timeOut = 600
total_time = 60
time.sleep(60)
# wait until the job is completed or the timeout is reached
while os.path.isfile('my_job_name.023'):
    if total_time > timeOut:
        my_job.kill()
        break
    total_time += 60
    time.sleep(60)
From outside: launching the job using the subprocess module.
Note: don't use the interactive keyword in your command, because it blocks the execution of the script while the simulation process is active.
import subprocess, os, time

# stdout/stderr must be file objects (or descriptors), not plain file names
log_file = open('my_study.log', 'w')
err_file = open('my_study.err', 'w')
my_cmd = 'abaqus job=my_abaqus_script analysis cpus=1'
proc = subprocess.Popen(
    my_cmd,
    cwd=my_working_dir,
    stdout=log_file,
    stderr=err_file,
    shell=True,
)
and checking the return code of the child process using poll() (see also returncode):
timeOut = 600
total_time = 60
time.sleep(60)
# wait until the job is completed or the timeout is reached
while proc.poll() is None:
    if total_time > timeOut:
        proc.terminate()
        break
    total_time += 60
    time.sleep(60)
or waiting until the timeOut is reached using wait()
timeOut = 600
try:
    proc.wait(timeOut)
except subprocess.TimeoutExpired:
    print('TimeOut reached!')
    proc.terminate()  # the job is still running, so stop it explicitly
Note: I know that the terminate() and wait() methods should work in theory, but I haven't tried this solution myself, so there may be additional complications (like having to look for all child processes created by Abaqus using psutil.Process(proc.pid)).
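On that last point, here is a minimal sketch of killing the whole process tree with psutil (the helper name is made up, and this assumes the psutil package is installed):

import psutil

def kill_process_tree(pid):
    # Abaqus may spawn solver processes under the launcher, so terminate
    # the children first, then the parent
    parent = psutil.Process(pid)
    for child in parent.children(recursive=True):
        child.terminate()
    parent.terminate()

# e.g. instead of proc.terminate() above:
# kill_process_tree(proc.pid)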

Is there an equivalent to Akka Streams' `conflate` and/or `batch` operators in Reactor?

I am looking for an equivalent of the batch and conflate operators from Akka Streams in Project Reactor, or some combination of operators that mimic their behavior.
The idea is to aggregate upstream items when the downstream backpressures in a reduce-like manner.
Note that this is different from this question because the throttleLatest / conflate operator described there is different from the one in Akka Streams.
Some background regarding what I need this for:
I am watching a change stream on a MongoDB and for every change I run an aggregate query on the MongoDB to update some metric. When lots of changes come in, the queries can't keep up and I'm getting errors. As I only need the latest value of the aggregate query, it is fine to aggregate multiple change events and run the aggregate query less often, but I want the metric to be as up-to-date as possible so I want to avoid waiting a fixed amount of time when there is no backpressure.
The closest I could come so far is this:
changeStream
    .window(Duration.ofSeconds(1))
    .concatMap { it.reduce(setOf<String>(), { applicationNames, event -> applicationNames + event.body.sourceReference.applicationName }) }
    .concatMap { Flux.fromIterable(it) }
    .concatMap { taskRepository.findTaskCountForApplication(it) }
but this would always wait for 1 second regardless of backpressure.
What I would like is something like this:
changeStream
    .conflateWithSeed({ setOf(it.body.sourceReference.applicationName) }, { applicationNames, event -> applicationNames + event.body.sourceReference.applicationName })
    .concatMap { Flux.fromIterable(it) }
    .concatMap { taskRepository.findTaskCountForApplication(it) }
I assume you only ever run one query at a time, with no parallel execution. My idea is to buffer elements in a list (which can be aggregated easily) for as long as the query is running. As soon as the query finishes, the next list is processed.
I tested it with the following code:
// A plain local boolean can't be mutated from within the lambdas, so use an AtomicBoolean
AtomicBoolean isQueryRunning = new AtomicBoolean(false);

Flux.range(0, 1000000)
    .delayElements(Duration.ofMillis(10))
    .bufferUntil(aLong -> !isQueryRunning.get())
    .doOnNext(integers -> isQueryRunning.set(true))
    .concatMap(integers -> Mono.fromCallable(() -> {
        int sleepTime = new Random().nextInt(10000);
        System.out.println("processing " + integers.size() + " elements. Sleep time: " + sleepTime);
        Thread.sleep(sleepTime);
        return "";
    })
    .subscribeOn(Schedulers.elastic()))
    .doOnNext(s -> isQueryRunning.set(false))
    .subscribe();
Which prints
processing 1 elements. Sleep time: 4585
processing 402 elements. Sleep time: 2466
processing 223 elements. Sleep time: 2613
processing 236 elements. Sleep time: 5172
processing 465 elements. Sleep time: 8682
processing 787 elements. Sleep time: 6780
It's clearly visible that the size of the next batch is proportional to the previous query's execution time (the sleep time).
Note that this is not a "real" backpressure solution, just a workaround. It is also not suited for parallel execution, and it might require some tuning to prevent running queries for empty batches.

Rufus Scheduler vs Loop with Sleep

I need to run an ActiveJob in my Rails 5 application as often as possible.
Up until now, I was using a cron job that was defined using the whenever gem (used Crono in the past).
This approach boots up the entire rails app before it does its thing, and then shuts down and does it all over again. I wish to avoid it, and instead have a persistent "service" that does the job.
I bumped into the Rufus Scheduler gem which seems nice, but then I started wondering if I need it at all.
So, my question is:
Are there any meaningful differences between these two options below:
# With Rufus
scheduler = Rufus::Scheduler.new
scheduler.every('1s') { puts 'hello' }
scheduler.join
# Without Rufus
loop { puts "hello" ; sleep 1 }
Note that either of these scripts will be executed with rails runner my_scheduler.rb in a Docker container, and any solution must ensure the job runs once only (never two in parallel).
There are differences.
Rufus-scheduler's every will try to run every 1s, so at t0 + 1, t0 + 2, t0 + 3, etc., while the sleep variant will run at t0, t0 + wt1 + 1, t0 + wt1 + 1 + wt2 + 1, ... (where wtN is the work time of the Nth call of puts 'hello').
Rufus-scheduler's every uses rufus-scheduler work threads, so if one run overlaps the next (the work time exceeds the interval), the next run occurs anyway. This behaviour can be changed (see job options).
Rufus-scheduler interval is closer to your sleep option:
scheduler.interval('1s') { puts 'hello' }
It places 1 second (or whatever time interval you want) between each run of puts 'hello'.
For a simple do-sleep-loop, I'd go with the sleep option. But if you need more, rufus-scheduler has cron, interval, every and tuning options for them. And you can schedule multiple jobs.
scheduler.cron('0 1 1 * *') do
  # every first of the month at 1am
  flush_archive
end

scheduler.cron('0 8,13,19 * * mon-fri') do
  # every weekday at 8am, 1pm, 7pm
  clean_dishes
end

scheduler.interval('1s') do
  puts 'hello'
end
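To make the every vs. interval timing distinction concrete, here is a minimal sketch of the two loop shapes in Python (illustrative only; this is not how rufus-scheduler is implemented):

import time

def fixed_rate(task, period=1.0):
    # "every"-style: target t0 + n*period, regardless of how long task takes
    next_run = time.monotonic() + period
    while True:
        task()
        time.sleep(max(0.0, next_run - time.monotonic()))
        next_run += period

def fixed_delay(task, pause=1.0):
    # "interval"-style: always pause between the end of one run and the
    # start of the next, so slow runs push everything later
    while True:
        task()
        time.sleep(pause)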

Why is lua so slow in redis? Any workarounds?

I'm evaluating the use of Lua scripts in Redis, and they seem to be a bit slow. I ran a benchmark as follows:
For the non-Lua version, I did a simple SET key_i val_i 1M times.
For the Lua version, I did the same thing, but in a script: EVAL "return redis.call('SET', KEYS[1], ARGV[1])" 1 key_i val_i
Testing on my laptop, the Lua version is about 3x slower than the non-Lua version. I understand that Lua is a scripting language, not compiled, etc., but this seems like a lot of performance overhead. Is this normal?
Assuming this is indeed normal, are there any workarounds? Is there a way to implement a script in a faster language, such as C (which Redis is written in), to achieve better performance?
Edit: I am testing this using the go code located here: https://gist.github.com/ortutay/6c4a02dee0325a608941
The problem is not with Lua or Redis; it's with your expectations. You are compiling a script 1 million times. There is no reason to expect this to be fast.
The purpose of EVAL within Redis is not to execute a single command; you could do that yourself. The purpose is to do complex logic within Redis itself, on the server rather than on your local client. That is, instead of doing one set operation per-EVAL, you actually perform the entire series of 1 million sets within a single EVAL script, which will be executed by the Redis server itself.
I don't know much about Go, so I can't write the syntax for calling it. But I know what the Lua script would look like:
-- ARGV values are strings, so convert the loop bound explicitly
for i = 1, tonumber(ARGV[1]) do
    local key = "key:" .. tostring(i)
    redis.call('SET', key, i)
end
Put that in a Go string, then pass that to the appropriate call, with no key arguments and a single non-key argument that is the number of times to loop.
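For illustration, here is that same call sketched with Python's redis-py client rather than Go (the client choice and the key:N prefix are assumptions):

import redis

LOOP_SCRIPT = """
for i = 1, tonumber(ARGV[1]) do
    redis.call('SET', 'key:' .. tostring(i), i)
end
"""

r = redis.Redis()
# numkeys=0, and the loop count goes in as the single non-key argument
r.eval(LOOP_SCRIPT, 0, 1000000)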
I stumbled on this thread and was also curious about the benchmark results. I wrote a quick Ruby script to compare them. The script does a simple SET/GET operation on the same key using different options.
require "redis"
def elapsed_time(name, &block)
start = Time.now
block.call
puts "#{name} - elapsed time: #{(Time.now-start).round(3)}s"
end
iterations = 100000
redis_key = "test"
redis = Redis.new
elapsed_time "Scenario 1: From client" do
iterations.times { |i|
redis.set(redis_key, i.to_s)
redis.get(redis_key)
}
end
eval_script1 = <<-LUA
redis.call("SET", "#{redis_key}", ARGV[1])
return redis.call("GET", "#{redis_key}")
LUA
elapsed_time "Scenario 2: Using EVAL" do
iterations.times { |i|
redis.eval(eval_script1, [redis_key], [i.to_s])
}
end
elapsed_time "Scenario 3: Using EVALSHA" do
sha1 = redis.script "LOAD", eval_script1
iterations.times { |i|
redis.evalsha(sha1, [redis_key], [i.to_s])
}
end
eval_script2 = <<-LUA
for i = 1,#{iterations} do
redis.call("SET", "#{redis_key}", tostring(i))
redis.call("GET", "#{redis_key}")
end
LUA
elapsed_time "Scenario 4: Inside EVALSHA" do
sha1 = redis.script "LOAD", eval_script2
redis.evalsha(sha1, [redis_key], [])
end
eval_script3 = <<-LUA
for i = 1,2*#{iterations} do
redis.call("SET", "#{redis_key}", tostring(i))
redis.call("GET", "#{redis_key}")
end
LUA
elapsed_time "Scenario 5: Inside EVALSHA with 2x the operations" do
sha1 = redis.script "LOAD", eval_script3
redis.evalsha(sha1, [redis_key], [])
en
I got the following results running on my MacBook Pro:
Scenario 1: From client - elapsed time: 11.498s
Scenario 2: Using EVAL - elapsed time: 6.616s
Scenario 3: Using EVALSHA - elapsed time: 6.518s
Scenario 4: Inside EVALSHA - elapsed time: 0.241s
Scenario 5: Inside EVALSHA with 2x the operations - elapsed time: 0.5s
In summary:
Scenario 1 vs. scenario 2 shows that the main contributor is the round-trip time: scenario 1 makes two requests to Redis per iteration while scenario 2 makes only one, and scenario 1 takes ~2x the execution time.
Scenario 2 vs. scenario 3 shows that EVALSHA does provide some benefit, and I am sure this benefit grows as the script gets more complex.
Scenario 4 vs. scenario 5 shows that the overhead of invoking the script is near minimal: we doubled the number of operations and saw a ~2x increase in execution time.
There is now a workaround using a module created by John Sully. It works for Redis and KeyDB and allows you to use the V8 JIT engine, which runs complex scripts much faster than Lua: https://github.com/JohnSully/ModJS
