I am trying to design load tests using Artillery on a computation-heavy API which typically requires at least a few seconds to send a response.
Starting from examples found in the docs, I was able to run some tests such as this one:
config:
  target: "https://example.com/api"
  phases:
    - duration: 60
      arrivalRate: 1
      name: Base case
I'd now like to send requests even more slowly (e.g. 1 every 5 seconds), but it seems this cannot be done using the arrivalRate parameter. Is there any way to do it that the docs do not mention?
Thanks in advance!
How can I customize the interval time of virtual user creation?
You can do that with arrivalCount which spreads the creation of
virtual users evenly over a period of time (whereas arrivalRate is
always per-second). E.g.:
config:
  phases:
    - duration: 60
      arrivalCount: 12
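For the original goal of one request every 5 seconds, a duration of 60 with arrivalCount: 12 works out to one new virtual user every 5 seconds. A minimal sketch reusing the target from the question (the scenario's GET path "/" is just a placeholder):
config:
  target: "https://example.com/api"
  phases:
    - duration: 60
      arrivalCount: 12
      name: One virtual user every 5 seconds
scenarios:
  - flow:
      - get:
          url: "/"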
Is there a way to force a Dataflow job to kill itself if it is running longer than xxx hours?
Kind regards
Marco
Posting the comment as an answer.
We are in the process of implementing this for Batch Pipelines. It is not yet available as a Dataflow flag, but it will be within a month.
We recently implemented this feature for Dataflow. You would do it by passing an extra experiment:
--experiments=max_workflow_runtime_walltime_seconds=300
Or whatever number of seconds.
Programmatically, this would be like so:
String experimentValue = String.format(
"max_workflow_runtime_walltime_seconds=%d",
killAfterSeconds);
ExperimentalOptions.addExperiment(myOptions.as(ExperimentalOptions.class), experimentValue);
In Python:
experiment_value = "max_workflow_runtime_walltime_seconds=%d" % timeout_secs
my_options.view_as(DebugOptions).add_experiment(experiment_value)
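For completeness, the same experiment can simply be passed on the command line when launching the pipeline; the script name and the project/region values below are placeholders, not taken from the original answer:
# Sketch only: pass the experiment as a regular pipeline option at launch time.
python my_pipeline.py \
  --runner=DataflowRunner \
  --project=my-project \
  --region=europe-west1 \
  --experiments=max_workflow_runtime_walltime_seconds=300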
I am using terraform resource google_dataflow_flex_template_job to deploy a Dataflow flex template job.
resource "google_dataflow_flex_template_job" "streaming_beam" {
provider = google-beta
name = "streaming-beam"
container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
parameters = {
"input_subscription" = google_pubsub_subscription.ratings[0].id
"output_table" = "${var.project}:beam_samples.streaming_beam_sql"
"service_account_email" = data.terraform_remote_state.state.outputs.sa.email
"network" = google_compute_network.network.name
"subnetwork" = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
}
}
It's all working fine. However, without my requesting it, the job seems to be using Flexible Resource Scheduling (FlexRS) mode; I say this because the job takes about ten minutes to start, and during that time it has state=QUEUED, which I think is only applicable to FlexRS jobs.
Using FlexRS mode is fine for production scenarios; however, I'm currently still developing my Dataflow job, and while doing so FlexRS is massively inconvenient because it takes about ten minutes to see the effect of any change I make, no matter how small.
In Enabling FlexRS it is stated:
To enable a FlexRS job, use the following pipeline option:
--flexRSGoal=COST_OPTIMIZED, where the cost-optimized goal means that the Dataflow service chooses any available discounted resources or
--flexRSGoal=SPEED_OPTIMIZED, where it optimizes for lower execution time.
I then found the following statement:
To turn on FlexRS, you must specify the value COST_OPTIMIZED to allow the Dataflow service to choose any available discounted resources.
at Specifying pipeline execution parameters > Setting other Cloud Dataflow pipeline options
I interpret that to mean that flexrs_goal=SPEED_OPTIMIZED will turn off FlexRS mode. I therefore changed the definition of my google_dataflow_flex_template_job resource to:
resource "google_dataflow_flex_template_job" "streaming_beam" {
provider = google-beta
name = "streaming-beam"
container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
parameters = {
"input_subscription" = google_pubsub_subscription.ratings[0].id
"output_table" = "${var.project}:beam_samples.streaming_beam_sql"
"service_account_email" = data.terraform_remote_state.state.outputs.sa.email
"network" = google_compute_network.network.name
"subnetwork" = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
"flexrs_goal" = "SPEED_OPTIMIZED"
}
}
(note the addition of "flexrs_goal" = "SPEED_OPTIMIZED"), but it doesn't seem to make any difference. The Dataflow UI confirms I have set SPEED_OPTIMIZED, yet it still takes too long (9 minutes 46 seconds) for the job to start processing data, and it was in state=QUEUED for all that time:
2021-01-17 19:49:19.021 GMT Starting GCE instance, launcher-2021011711491611239867327455334861, to launch the template.
...
...
2021-01-17 19:59:05.381 GMT Starting 1 workers in europe-west1-d...
2021-01-17 19:59:12.256 GMT VM, launcher-2021011711491611239867327455334861, stopped.
I then tried explicitly setting flexrs_goal=COST_OPTIMIZED just to see if it made any difference, but this only caused an error:
"The workflow could not be created. Causes: The workflow could not be
created due to misconfiguration. The experimental feature
flexible_resource_scheduling is not supported for streaming jobs.
Contact Google Cloud Support for further help. "
This makes sense. My job is indeed a streaming job, and the documentation does indeed state that FlexRS is only for batch jobs.
This page explains how to enable Flexible Resource Scheduling (FlexRS) for autoscaled batch pipelines in Dataflow.
https://cloud.google.com/dataflow/docs/guides/flexrs
This doesn't solve my problem though. As I said above, if I deploy with flexrs_goal=SPEED_OPTIMIZED the job still sits in state=QUEUED for almost ten minutes, yet as far as I know QUEUED is only applicable to FlexRS jobs:
Therefore, after you submit a FlexRS job, your job displays an ID and a Status of Queued
https://cloud.google.com/dataflow/docs/guides/flexrs#delayed_scheduling
Hence I'm very confused:
Why is my job getting queued even though it is not a FlexRS job?
Why does it take nearly ten minutes for my job to start processing any data?
How can I speed up the time it takes for my job to start processing data so that I can get quicker feedback during development/testing?
UPDATE: I dug a bit more into the logs to find out what was going on during those 9 minutes 46 seconds. These two consecutive log messages are 7 minutes 23 seconds apart:
2021-01-17 19:51:03.381 GMT
"INFO:apache_beam.runners.portability.stager:Executing command: ['/usr/local/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/dataflow-requirements-cache', '-r', '/dataflow/template/requirements.txt', '--exists-action', 'i', '--no-binary', ':all:']"
2021-01-17 19:58:26.459 GMT
"INFO:apache_beam.runners.portability.stager:Downloading source distribution of the SDK from PyPi"
Whatever is going on between those two log records is the main contributor to the long time spent in state=QUEUED. Anyone know what might be the cause?
As mentioned in the existing answer, you need to take the apache-beam modules out of your requirements.txt and install them separately in the template's container image; the long gap in your logs is the launcher running pip download ... --no-binary :all: against requirements.txt, which downloads and builds every listed package from source at launch time:
RUN pip install -U apache-beam==<version>
RUN pip install -U -r ./requirements.txt
While developing, I prefer to use DirectRunner, for the fastest feedback.
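For example, a local development run might look something like the sketch below; my_pipeline.py and the option values are placeholders, not taken from the original post:
# Sketch only: run the pipeline locally with the DirectRunner for fast iteration.
python my_pipeline.py \
  --runner=DirectRunner \
  --streaming \
  --input_subscription=projects/my-project/subscriptions/ratings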
Since our test agents are sometimes slow, I am trying to add some additional timeouts for some commands.
I did it by using a timeout value on the command, as shown below, but it's not respecting the value given.
My understanding is that Cypress will wait 10000 ms to get the #addstory element?
Can anyone advise whether this is the correct way, please?
Thank you so much.
cy.get('#addstory > .ng-scope').click({ timeout: 10000 })
In the cypress.json file, increase the timeout to 10 seconds or whatever timeout you want, like this: "defaultCommandTimeout": 10000, and save the file. Now close the app and open it again. Navigate to Settings > Configuration and you should be able to see the new value set for defaultCommandTimeout.
The issue was that I was adding the timeout on the click, not on getting the element. When I changed it as below, all is good: it waits for #addstory to be visible, as I expected, before clicking:
cy.get('#addstory > .ng-scope',{ timeout: 10000 }).click()
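If you want the visibility wait to be explicit rather than implied by the click's built-in checks, a small sketch along the same lines:
// Sketch: retry for up to 10s until the element exists and is visible, then click.
cy.get('#addstory > .ng-scope', { timeout: 10000 })
  .should('be.visible')
  .click()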
So, in my collection I have about ten requests, with the last two being:
/Wait 10 seconds
/Check Complete
The first makes a call to Postman's echo service (delay by 10 seconds) and the second is the call to my system to check for the complete status. Now, if the status is not yet available, I wait another 10 seconds:
postman.setNextRequest("Wait 10 seconds");
The complete status on my system can appear in a minute or so. Now, as one can see, this is an infinite loop if something goes wrong with the system and the status never becomes complete. Is there a way in a Postman/newman test to fail the test if it has been running for more than 2 minutes, for example?
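For context, the re-queueing logic in the /Check Complete request's Tests tab presumably looks something like the sketch below; the response shape and the "status"/"COMPLETE" values are assumptions:
// Sketch of the polling loop described above ("status"/"COMPLETE" are assumed values).
var jsonData = pm.response.json();
if (jsonData.status !== "COMPLETE") {
    postman.setNextRequest("Wait 10 seconds");
}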
Additionally, this will be executed in Jenkins from the command line, so I am not really looking at Postman settings or delays between requests in the runner.
You may have a look at the newman options here: https://www.npmjs.com/package/newman#newman-run-collection-file-source-options. The interesting option is --timeout-request: it will surely fulfill your need.
In Postman itself, you may test the responseTime. I recall that there is a snippet, in the panel on the right, which looks like this:
tests["Response time is less than 200ms"] = responseTime < 200;
and which could help you, as the test fails if the response does not occur within the requested time.
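In current Postman versions the same idea can be written with the pm.* API; a sketch using the two-minute budget from the question:
// Sketch: fail this request's tests if the response took longer than 2 minutes.
pm.test("Response time is less than 2 minutes", function () {
    pm.expect(pm.response.responseTime).to.be.below(120000);
});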
Alexandre
If you are going to be using a Jenkins pipeline, you can use the timeout step to cause long-running jobs to result in failure; here's one for 2 minutes (the step's default unit is minutes, so the unit is spelled out explicitly):
timeout(time: 2, unit: 'MINUTES') {
    node {
        sh 'newman command'
    }
}
Check out the "Pipeline Syntax" editor in Jenkins to generated your code block and look for other useful functions.
I would like to configure a global retry limit in Sidekiq. By default, Sidekiq allows up to 25 retries, but I want to set the limit lower for all workers, to avoid the long default retry period when no limit is explicitly specified on a worker.
You can also configure it in your sidekiq.yml:
:max_retries: 10
:queues:
- queue_1
- queue_2
Refer to the Sidekiq docs for details.
Sidekiq.default_worker_options['retry'] = 10
https://github.com/mperham/sidekiq/wiki/Advanced-Options#workers
This value is being stored in options and (AFAIK) has no nifty setter for it, so here you go:
Sidekiq.options[:max_retries] = 5
It might be set for RetryJobs in the middleware initializer as well.
You can use Sidekiq.default_worker_options in your initializer. So to set a lower limit it'd be
Sidekiq.default_worker_options = { retry: 5 }
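A sketch of where this typically lives, plus a per-worker override for completeness; the worker class below is hypothetical:
# config/initializers/sidekiq.rb -- sketch only
Sidekiq.default_worker_options = { retry: 5 }

# An individual worker can still override the global default:
class HypotheticalWorker
  include Sidekiq::Worker
  sidekiq_options retry: 0  # no retries; failures go straight to the Dead set

  def perform(*args)
    # ...
  end
end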
I'm currently working on setting this up to limit the amount of error noise created by our staging environments (for the sake of staying well below our error-handling service limits). It seems that the key is now max_retries when changing the count, and retry is a boolean for whether the job should retry at all or go straight to the "Dead" queue.
https://github.com/mperham/sidekiq/wiki/Error-Handling#automatic-job-retry
This is what it looks for me in my Sidekiq config file:
if Rails.env.staging?
  Sidekiq.default_worker_options['max_retries'] = 5
end
UPDATE: could have been my own confusion, but for some reason, default_worker_options did not seem to be working consistently for me. I ended up changing it to this and it worked as I hoped. Failed jobs went straight to the Dead queue:
Sidekiq.options[:max_retries] = 0