How to get Jenkins build job duration

How to get Jenkins build job duration - jenkins

Is it possible to get Jenkins build duration using scripts or using some inbuilt functionality. I tried with ${BUILD_DURATION} but it didn't work.
Can anyone please suggest?

Here are a couple of options:
RESTFUL API
Use a language of your choice to consume Jenkins Restful API to get the json for the build and deserialise.
DECLARATIVE PIPELINE
Use ${currentBuild.durationString}.
The currentBuild object exposes a number of relevant attributes:
timeInMillis:
time since the epoch when the build was scheduled
startTimeInMillis
time since the epoch when the build started running
duration
duration of the build in milliseconds
durationString
a human-readable representation of the build duration
Navigate to https://<your-jenkins-instance>/pipeline-syntax/globals#currentBuild for the complete list.

I am Jenkins beginner & below Groovy script is an easy approach;
def item = Jenkins.instance.getItem("<Job_name>")
def last_job_duration = item.getLastBuild().getDurationString()
println last_job_duration

Related

Is there a way to force a Dataflow job to stop after it is running for a long time

is there a way to force a DAtaflow job to kill itself if it is running longer than xxx hours?
Kind regards
Marco

Posting the comment as an answer.
We are in the process of implementing this for Batch Pipelines. It is not yet available as a Dataflow flag, but it will be within a month.

We recently implemented this feature for Dataflow. You would do it by passing an extra experiment:
--experiments=max_workflow_runtime_walltime_seconds=300
Or whatever number of seconds.
Programatically this would be like so:
String experimentValue = String.format(
"max_workflow_runtime_walltime_seconds=%d",
killAfterSeconds);
ExperimentalOptions.addExperiment(myOptions.as(ExperimentalOptions.class), experimentValue);
In Python:
experiment_value = "max_workflow_runtime_walltime_seconds=%d" % timeout_secs
my_options.view_as(DebugOptions).add_experiment(experiment_value)

Can I make flex template jobs take less than 10 minutes before they start to process data?

I am using terraform resource google_dataflow_flex_template_job to deploy a Dataflow flex template job.
resource "google_dataflow_flex_template_job" "streaming_beam" {
provider = google-beta
name = "streaming-beam"
container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
parameters = {
"input_subscription" = google_pubsub_subscription.ratings[0].id
"output_table" = "${var.project}:beam_samples.streaming_beam_sql"
"service_account_email" = data.terraform_remote_state.state.outputs.sa.email
"network" = google_compute_network.network.name
"subnetwork" = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
}
}
Its all working fine however without my requesting it the job seems to be using flexible resource scheduling (flexRS) mode, I say this because the job takes about ten minutes to start and during that time has state=QUEUED which I think is only applicable to flexRS jobs.
Using flexRS mode is fine for production scenarios however I'm currently still developing my dataflow job and when doing so flexRS is massively inconvenient because it takes about 10 minutes to see the effect of any changes I might make, no matter how small.
In Enabling FlexRS it is stated
To enable a FlexRS job, use the following pipeline option:
--flexRSGoal=COST_OPTIMIZED, where the cost-optimized goal means that the Dataflow service chooses any available discounted resources or
--flexRSGoal=SPEED_OPTIMIZED, where it optimizes for lower execution time.
I then found the following statement:
To turn on FlexRS, you must specify the value COST_OPTIMIZED to allow the Dataflow service to choose any available discounted resources.
at Specifying pipeline execution parameters > Setting other Cloud Dataflow pipeline options
I interpret that to mean that flexrs_goal=SPEED_OPTIMIZED will turn off flexRS mode. However, I changed the definition of my google_dataflow_flex_template_job resource to:
resource "google_dataflow_flex_template_job" "streaming_beam" {
provider = google-beta
name = "streaming-beam"
container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
parameters = {
"input_subscription" = google_pubsub_subscription.ratings[0].id
"output_table" = "${var.project}:beam_samples.streaming_beam_sql"
"service_account_email" = data.terraform_remote_state.state.outputs.sa.email
"network" = google_compute_network.network.name
"subnetwork" = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
"flexrs_goal" = "SPEED_OPTIMIZED"
}
}
(note the addition of "flexrs_goal" = "SPEED_OPTIMIZED") but it doesn't seem to make any difference. The Dataflow UI confirms I have set SPEED_OPTIMIZED:
but it still takes too long (9 minutes 46 seconds) for the job to start processing data, and it was in state=QUEUED for all that time:
2021-01-17 19:49:19.021 GMTStarting GCE instance, launcher-2021011711491611239867327455334861, to launch the template.
...
...
2021-01-17 19:59:05.381 GMTStarting 1 workers in europe-west1-d...
2021-01-17 19:59:12.256 GMTVM, launcher-2021011711491611239867327455334861, stopped.
I then tried explictly setting flexrs_goal=COST_OPTIMIZED just to see if it made any difference, but this only caused an error:
"The workflow could not be created. Causes: The workflow could not be
created due to misconfiguration. The experimental feature
flexible_resource_scheduling is not supported for streaming jobs.
Contact Google Cloud Support for further help. "
This makes sense. My job is indeed a streaming job and the documentation does indeed state that flexRS is only for batch jobs.
This page explains how to enable Flexible Resource Scheduling (FlexRS) for autoscaled batch pipelines in Dataflow.
https://cloud.google.com/dataflow/docs/guides/flexrs
This doesn't solve my problem though. As I said above if I deploy with flexrs_goal=SPEED_OPTIMIZED then still state=QUEUED for almost ten minutes, yet as far as I know QUEUED is only applicable to flexRS jobs:
Therefore, after you submit a FlexRS job, your job displays an ID and a Status of Queued
https://cloud.google.com/dataflow/docs/guides/flexrs#delayed_scheduling
Hence I'm very confused:
Why is my job getting queued even though it is not a flexRS job?
Why does it take nearly ten minutes for my job to start processing any data?
How can I speed up the time it takes for my job to start processing data so that I can get quicker feedback during development/testing?
UPDATE, I dug a bit more into the logs to find out what was going on during those 9minutes 46 seconds. These two consecutive log messages are 7 minutes 23 seconds apart:
2021-01-17 19:51:03.381 GMT
"INFO:apache_beam.runners.portability.stager:Executing command: ['/usr/local/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/dataflow-requirements-cache', '-r', '/dataflow/template/requirements.txt', '--exists-action', 'i', '--no-binary', ':all:']"
2021-01-17 19:58:26.459 GMT
"INFO:apache_beam.runners.portability.stager:Downloading source distribution of the SDK from PyPi"
Whatever is going on between those two log records is the main contributor to the long time spent in state=QUEUED. Anyone know what might be the cause?

As mentioned in the existing answer you need to extract the apache-beam modules inside your requirements.txt:
RUN pip install -U apache-beam==<version>
RUN pip install -U -r ./requirements.txt

While developing, I prefer to use DirectRunner, for the fastest feedback.

How can I programmatically cancel a Dataflow job that has run for too long?

I'm using Apache Beam on Dataflow through Python API to read data from Bigquery, process it, and dump it into Datastore sink.
Unfortunately, quite often the job just hangs indefinitely and I have to manually stop it. While the data gets written into Datastore and Redis, from the Dataflow graph I've noticed that it's only a couple of entries that get stuck and leave the job hanging.
As a result, when a job with fifteen 16-core machines is left running for 9 hours (normally, the job runs for 30 minutes), it leads to huge costs.
Maybe there is a way to set a timer that would stop a Dataflow job if it exceeds a time limit?

It would be great if you can create a customer support ticket where we would could try to debug this with you.
Maybe there is a way to set a timer that would stop a Dataflow job if
it exceeds a time limit?
Unfortunately the answer is no, Dataflow does not have an automatic way to cancel a job after a certain time. However, it is possible to do this using the APIs. It is possible to wait_until_finish() with a timeout then cancel() the pipeline.
You would do this like so:
p = beam.Pipeline(options=pipeline_options)
p | ... # Define your pipeline code
pipeline_result = p.run() # doesn't do anything
pipeline_result.wait_until_finish(duration=TIME_DURATION_IN_MS)
pipeline_result.cancel() # If the pipeline has not finished, you can cancel it

To sum up, with the help of #ankitk answer, this works for me (python 2.7, sdk 2.14):
pipe = beam.Pipeline(options=pipeline_options)
... # main pipeline code
run = pipe.run() # doesn't do anything
run.wait_until_finish(duration=3600000) # (ms) actually starts a job
run.cancel() # cancels if can be cancelled
Thus, in case if a job was successfully finished within the duration time in wait_until_finished() then cancel() will just print a warning "already closed", otherwise it will close a running job.
P.S. if you try to print the state of a job
state = run.wait_until_finish(duration=3600000)
logging.info(state)
it will be RUNNING for the job that wasn't finished within wait_until_finished(), and DONE for finished job.

Note: this technique will not work when running Beam from within a Flex Template Job...

The run.cancel() method doesn't work if you are writing a template and I haven't seen any successful work around it...

Scripting within Jenkins job

Main problem:
In the Jenkins build, I have a variable GIT_BRANCH of this content: origin/feature/JIRA1-add-component-A.
From it I would like to obtain another variable with the value: %3Aorigin%2Ffeature%2FJIRA1-add-component-A
For this, I need scripting. How can I do scripting in the job definition?
Context & further explanation:
From Jenkins, for every pull request that my project has, I am creating an instance of a project in SonarQube.
For example, if I have a pull request from branch origin/feature/JIRA1-add-component-A, I am creating a project with the following URL:
http://localhost:9000/dashboard?id=com.my.package%3Amy-project%3Aorigin%2Ffeature%2FJIRA1-add-component-A
I am using the Quality Gates plugin to fail the build in case the code's quality(as seen by SonarQube)is not good.
The problem is that Quality Gates needs my SonarQube project name, so in this case com.my.package%3Amy-project%3Aorigin%2Ffeature%2FJIRA1-add-component-A.
However, I can only specify it like this:
com.my.package:my-project:${GIT_BRANCH}
Which translates into this:
com.my.package:my-project:origin/feature/JIRA1-add-component-A

Something like this? Apologize if not, but dead tired lol.
def GIT_BRANCH = "origin/feature/JIRA1-add-component-A"
def BRANCH = "%A3" + "${GIT_BRANCH}".replace("/", "%2F")
println "${BRANCH}"
I'm sure it can be done easier BUT:
Pre: origin/feature/JIRA1-add-component-A
Post: %A3origin%2Ffeature%2FJIRA1-add-component-A

Jenkins, how to check regressions against another job

When you set up a Jenkins job various test result plugins will show regressions if the latest build is worse than the previous one.
We have many jobs for many projects on our Jenkins and we wanted to avoid having a 'job per branch' set up. So currently we are using a parameterized build to build eg different development branches using a single job.
But that means when I build a new branch any regressions are measured against the previous build, which may be for a different branch. What I really want is to measure regressions in a feature branch against the latest build of the master branch.
I thought we should probably set up a separate 'master' build alongside the parameterized 'branches' build. But I still can't see how I would compare results between jobs. Is there any plugin that can help?
UPDATE
I have started experimenting in the Script Console to see if I could write a post-build script... I have managed to get the latest build of master branch in my parameterized job... I can't work out how to get to the test results from the build object though.
The data I need is available in JSON at
http://<jenkins server>/job/<job name>/<build number>/testReport/api/json?pretty=true
...if I could just get at this data structure it would be great!
I tried using JsonSlurper to load the json via HTTP but I get 403, I guess because my script has no auth session.
I guess I could load the xml test results from disk and parse them in my script, it just seems a bit stupid when Jenkins has already done this.

I eventually managed to achieve everything I wanted, using a Groovy script in the Groovy Postbuild Plugin
I did a lot of exploring using the script console http://<jenkins>/script and also the Jenkins API class docs are handy.
Everyone's use is going to be a bit different as you have to dig down into the build plugins to get the info you need, but here's some bits of my code which may help.
First get the build you want:
def getProject(projectName) {
// in a postbuild action use `manager.hudson`
// in the script web console use `Jenkins.instance`
def project = manager.hudson.getItemByFullName(projectName)
if (!project) {
throw new RuntimeException("Project not found: $projectName")
}
project
}
// CloudBees folder plugin is supported, you can use natural paths:
project = getProject('MyFolder/TestJob')
build = project.getLastCompletedBuild()
The main test results (jUnit etc) seem to be available directly on the build as:
result = build.getTestResultAction()
// eg
failedTestNames = result.getFailedTests().collect{ test ->
test.getFullName()
}
To get the more specialised results from eg Violations plugin or Cobertura code coverage you have to look for a specific build action.
// have a look what's available:
build.getActions()
You'll see a list of stuff like:
[hudson.plugins.git.GitTagAction#2b4b8a1c,
hudson.scm.SCMRevisionState$None#40d6dce2,
hudson.tasks.junit.TestResultAction#39c99826,
jenkins.plugins.show_build_parameters.ShowParametersBuildAction#4291d1a5]
These are instances, the part in front of the # sign is the class name so I used that to make this method for getting a specific action:
def final VIOLATIONS_ACTION = hudson.plugins.violations.ViolationsBuildAction
def final COVERAGE_ACTION = hudson.plugins.cobertura.CoberturaBuildAction
def getAction(build, actionCls) {
def action = build.getActions().findResult { act ->
actionCls.isInstance(act) ? act : null
}
if (!action) {
throw new RuntimeException("Action not found in ${build.getFullDisplayName()}: ${actionCls.getSimpleName()}")
}
action
}
violations = getAction(build, VIOLATIONS_ACTION)
// you have to explore a bit more to find what you're interested in:
pylint_count = violations?.getReport()?.getViolations()?."pylint"
coverage = getAction(build, COVERAGE_ACTION)?.getResults()
// if you println it looks like a map but it's really an Enum of Ratio objects
// convert to something nicer to work with:
coverage_map = coverage.collectEntries { key, val -> [key.name(), val.getPercentageFloat()] }
With these building blocks I was able to put together a post-build script which compared the results for two 'unrelated' build jobs, then using the Groovy Postbuild plugin's helper methods to set the build status.
Hope this helps someone else.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart