Displaying slowest Jenkins pipelines in Grafana

I set up a monitoring system to track our Jenkins pipelines using Prometheus and Grafana (see Jenkins Prometheus Plugin). I am building some dashboards, and while doing so I tried to create a table graph that displays the 5 slowest pipelines. This is the Prometheus query I used:
topk(5, max by (jenkins_job) (default_jenkins_builds_last_build_duration_milliseconds / 60000))
[Screenshot: Grafana table visualization]
However, instead of displaying 5 rows, the table shows numerous timestamps for every pipeline. Does anybody have an idea how to solve this? I tried numerous attempts described on Stack Overflow (e.g. this one), without success. Thanks in advance!
[Desired result: 5 slowest Jenkins pipelines, one record each]
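A note for readers hitting the same behavior: a Grafana table fed by a Prometheus range query gets one row per timestamp per series. A minimal sketch of the usual fix, assuming a reasonably recent Grafana: in the query options set Type to Instant and Format to Table, so only the latest sample of each series is returned, and the same query collapses to one row per job:

topk(5, max by (jenkins_job) (default_jenkins_builds_last_build_duration_milliseconds / 60000))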

Related

How to find the fluctuation of a metric for prometheus/grafana?

I am using Prometheus and Grafana to visualize data from a Jenkins instance. I use the Jenkins Metrics and Prometheus Metrics plugins to expose metrics to Prometheus. I have created a basic Grafana dashboard with instant metrics and some graphs, and now I need to write a PromQL query that extracts the fluctuation of a Jenkins job's build time since the last time the metric changed. I found the changes() and rate() PromQL functions, but I don't get the result I expect.
The last query I tried was: changes(default_jenkins_builds_last_build_duration_milliseconds{jenkins_job="$project"}[1m])
where the variable $project lets me select the job I need to investigate in Grafana.
Is that the right approach? Do you have any alternative ideas?
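One thing that may explain the unexpected result: changes() only counts how many times the value changed in the window, and rate() is meant for counters, so neither tells you by how much the build time moved. A minimal sketch of an alternative, assuming the goal is the size of the movement of a gauge, using delta() over a longer window:

delta(default_jenkins_builds_last_build_duration_milliseconds{jenkins_job="$project"}[1h])

The [1h] range is an assumption; it needs to be at least as long as the gap between builds, otherwise the last change falls outside the window.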

Differentiate databricks streaming queries in datadog

I am trying to set up a dashboard on Datadog that will show the streaming metrics for my streaming job. The job itself contains two tasks: one task has 2 streaming queries and the other has 4 (both tasks use the same cluster). I followed the instructions here to install Datadog on the driver node. However, when I go to Datadog and try to create a dashboard, there is no way to differentiate between the 6 different streaming queries, so they are all lumped together (none of the tags for the metrics differ per query).
After some digging I found an option you can enable via the init script, called enable_query_name_tag, which is disabled by default because it can create a large number of tags when you are not using query names.
The modification is shown here:
instances:
  - spark_url: http://\$DB_DRIVER_IP:\$DB_DRIVER_PORT
    spark_cluster_mode: spark_standalone_mode
    cluster_name: \${hostip}
    streaming_metrics: true
    enable_query_name_tag: true   # <-- the option to enable
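Worth noting: the tag can only distinguish queries that actually have names set on the Spark side. A minimal sketch of naming a Structured Streaming query, in Java (the source, sink, and query name here are hypothetical placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class NamedStreamingQuery {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("named-streaming-query")
            .getOrCreate();

        // Hypothetical source; any streaming Dataset works the same way.
        Dataset<Row> events = spark.readStream().format("rate").load();

        // queryName() is what enable_query_name_tag surfaces as a tag,
        // so each of the six queries should get a distinct name.
        StreamingQuery query = events.writeStream()
            .queryName("orders_stream")   // hypothetical query name
            .format("console")
            .start();

        query.awaitTermination();
    }
}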

Dataflow OutOfMemoryError while reading small tables from BigQuery

We have a pipeline reading data from BigQuery and processing historical data for various calendar years. It fails with OutOfMemoryError when the input data is small (~500MB).
On startup it reads from BigQuery at about 10,000 elements/sec; after a short time it slows down to hundreds of elements/sec, then it hangs completely.
Observing 'Elements Added' on the next processing step (BQImportAndCompute), the value increases and then decreases again. That looks to me as if some already-loaded data is dropped and then loaded again.
Stackdriver Logging console contains errors with various stack traces that contain java.lang.OutOfMemoryError, for example:
Error reporting workitem progress update to Dataflow service:
java.lang.OutOfMemoryError: Java heap space
    at com.google.cloud.dataflow.sdk.runners.worker.BigQueryAvroReader$BigQueryAvroFileIterator.getProgress(BigQueryAvroReader.java:145)
    at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation$SynchronizedReaderIterator.setProgressFromIteratorConcurrent(ReadOperation.java:397)
    at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation$SynchronizedReaderIterator.setProgressFromIterator(ReadOperation.java:389)
    at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation$1.run(ReadOperation.java:206)
I would suspect a problem with the topology of the pipeline, but the same pipeline:
runs fine locally with DirectPipelineRunner
runs fine in the cloud with DataflowPipelineRunner on a large dataset (5GB, for another year)
I assume the problem is in how Dataflow parallelizes and distributes work in the pipeline. Are there any possibilities to inspect or influence it?
The problem here doesn't seem to be related to the size of the BigQuery table, but more likely to the number of BigQuery sources being used and the rest of the pipeline.
Instead of reading from multiple BigQuery sources and flattening them, have you tried reading from a single query that pulls in all the information? Doing that in one step should simplify the pipeline and also lets BigQuery execute more efficiently (one query against multiple tables vs. multiple queries against individual tables).
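A minimal sketch of that approach against the Dataflow 1.x Java SDK that appears in the stack trace above (the project, dataset, and table names are hypothetical; in legacy BigQuery SQL the comma between tables is a union):

import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.values.PCollection;

public class SingleQueryRead {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(
            PipelineOptionsFactory.fromArgs(args).withValidation().create());

        // One query over all the per-year tables instead of one
        // BigQuery source per table followed by a Flatten.
        PCollection<TableRow> rows = p.apply(
            BigQueryIO.Read.named("ReadAllYears").fromQuery(
                "SELECT * FROM [project:dataset.year2014], [project:dataset.year2015]"));

        p.run();
    }
}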
Another possible problem is if there is a high degree of fan-out within or after the BQImportAndCompute operation. Depending on the computation being done there, you may be able to reduce the fan-out using clever CombineFns or WindowFns. If you want help figuring out how to improve that path, please share more details about what is happening after the BQImportAndCompute.
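As an illustration of the CombineFn idea, here is a minimal sketch with the same SDK, assuming a hypothetical PCollection<KV<String, Integer>> named perKeyCounts produced by the fan-out step; pre-aggregating per key lets workers combine locally before the shuffle instead of materializing every element:

import com.google.cloud.dataflow.sdk.transforms.Sum;
import com.google.cloud.dataflow.sdk.values.KV;
import com.google.cloud.dataflow.sdk.values.PCollection;

// perKeyCounts is a hypothetical PCollection<KV<String, Integer>>
// coming out of the high fan-out step.
PCollection<KV<String, Integer>> totals =
    perKeyCounts.apply(Sum.integersPerKey());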
Have you tried debugging with Stackdriver?
https://cloud.google.com/blog/big-data/2016/04/debugging-data-transformations-using-cloud-dataflow-and-stackdriver-debugger

How can I create nice graphs from the internals of Maven runs in Jenkins?

I would like to create some statistics for my Selenium UI test job in Jenkins. Calculating the metrics in the Maven job is easy, but is there any way to add a graph to the Jenkins job with the numbers I generate?
For example, I calculate the average site response time in my UI tests and add it, along with other metrics, as an output artifact (e.g. a JSON document). How can I show a graph that displays those metrics over the previous x runs directly inside Jenkins?
I'm not entirely sure this is the correct stack exchange site so point me in the right direction if it isn't.
If it's not a default JUnit reporter (or some other popular reporter), you can create your own HTML page and publish it using https://wiki.jenkins-ci.org/display/JENKINS/HTML+Publisher+Plugin
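For the graph-over-previous-runs part, HTML Publisher can keep one report per build. A minimal Pipeline-style sketch, assuming the plugin is installed and the Maven job writes its metrics page to target/metrics-report/ (the directory, file, and report name are hypothetical):

// Assumes the HTML Publisher Plugin is installed and the metrics
// page was already generated by the Maven build.
publishHTML(target: [
    allowMissing: false,
    alwaysLinkToLastBuild: false,
    keepAll: true,                  // keep a report per build for history
    reportDir: 'target/metrics-report',
    reportFiles: 'index.html',
    reportName: 'UI Test Metrics'
])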

Filter the Jenkins executor list to only the active ones

The company I work for uses a large Jenkins server with a large number of slaves to handle the build and testing for a particular product. The large number of slaves, many with numerous executor slots, makes for a very long list of slaves/executors in our typical Jenkins view.
Many users have asked if this view could be compressed down to just a list of slaves, showing only the executors that are active. For example, assume that Slave A has 2 executors, both idle, and Slave B has 2 executors, one active. The display would look like this:
Slave A (2 available)
Slave B (1 available)
2: Building Job A
Instead of the typical view (using the same example):
Slave A
1: Idle
2: Idle
Slave B
1: Idle
2: Building Job A
I searched for a plugin that would do this, or any native behavior, but didn't see anything like it. Does anyone know if this is possible and, if so, how?
When you create a view, you can also set the view to show only the slaves that are relevant to the view.
Every user can also create their own views and set their own default view.
It is not exactly what you asked, but AFAIK this is the only way to limit the number of slaves displayed.
We had exactly the same issue recently, so I made a simple plugin for it. It basically has a switch to either show or hide all the offline nodes. If anyone still needs it or wants to add more features, you can go to this link and download the plugin.
