Creating a structured Jenkins Failing Test Report - jenkins

The situation right now:
Every Monday morning I manually check the JUnit results of the Jenkins jobs that ran over the weekend. Using the Project Health plugin I can filter on the timeboxed runs. I then copy and paste this table into Excel and go over each test case's output log to see what failed and note down the failure cause. Every weekend gets another tab in Excel. All this makes traceability a nightmare and causes time-consuming manual labor.
What I am looking for (and hoping that already exists to some degree):
A database that stores all failed tests for all jobs I specify. It parses the output log of a failed test case and based on some regex applies a 'tag' e.g. 'Audio' if a test regarding audio is failing. Since everything is in a database I could make or use a frontend that can apply filters at will.
For example, if I want to see all tests regarding audio failing over the weekend (over multiple jobs and multiple runs) I could run a query that returns all entries with the Audio tag.
I'm OK with manually tagging failed tests and their causes, as well as writing my own frontend. If such a system does not exist, is there a way (the Jenkins API, perhaps?) to grab the failed tests (JUnit format, via the Jenkins plugin) and build it myself?

A good question. Unfortunately, it is very difficult in Jenkins to get such "meta statistics" that span several jobs. There is no existing solution for that.
Basically, I see two options for getting what you want:
Post-processing Jenkins-internal data to get the statistics that you need.
Feeding a database on-the-fly with build execution data.
The first option basically means automating the tasks that you do manually right now.
You can use external scripting (Python, Perl, ...) to process Jenkins-internal data (via the REST or CLI APIs, or by directly reading on-disk data),
or you can run Groovy scripts internally (which will be faster and more powerful).
It's the most direct way to go. However, depending on the statistics that you need and on your requirements regarding data persistence, you may want to go for...
The second option: more flexible and completely decoupled from Jenkins' internal data storage. You could implement it by
introducing a Groovy post-build step for all your jobs
that script parses the job results and puts the data of interest into a custom, external database.
You'd then get your statistics by querying that database.
Typically, you'd start with the first option. Once requirements grow, you'd slowly migrate to the second one (e.g., by collecting internal data via explicit post-processing scripts, putting that into a database, and then running queries on it). You'll want to cut this migration phase as short as possible, as it eventually requires the effort of implementing both options.
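As a rough illustration of the first option with external scripting, here is a minimal sketch of the tagging and storage step. It assumes the JUnit plugin's test report has already been fetched as JSON (typically from `<build URL>/testReport/api/json`) and that each case carries `className`, `name`, `status`, `errorDetails` and `errorStackTrace` fields. The tag patterns, the `failures` table layout and the sample data are hypothetical.

```python
import re
import sqlite3

# Hypothetical tag rules: tag name -> regex applied to the failure output.
TAG_RULES = {
    "Audio": re.compile(r"audio|alsa|codec", re.IGNORECASE),
    "Network": re.compile(r"timed? ?out|connection refused", re.IGNORECASE),
}

# Statuses assumed to mark a failing case in the test report.
FAILED_STATUSES = {"FAILED", "REGRESSION"}


def tags_for(text):
    """Return the sorted tag names whose regex matches the failure text."""
    return sorted(tag for tag, rx in TAG_RULES.items() if rx.search(text))


def store_failures(conn, job, build, report):
    """Insert one row per (failed case, tag) from a testReport-shaped dict."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS failures "
        "(job TEXT, build INTEGER, test TEXT, tag TEXT, details TEXT)"
    )
    for suite in report.get("suites", []):
        for case in suite.get("cases", []):
            if case.get("status") not in FAILED_STATUSES:
                continue
            details = (case.get("errorDetails") or "") + "\n" + (
                case.get("errorStackTrace") or ""
            )
            test = f"{case.get('className')}.{case.get('name')}"
            # Cases that match no rule are kept under a placeholder tag.
            for tag in tags_for(details) or ["Untagged"]:
                conn.execute(
                    "INSERT INTO failures VALUES (?, ?, ?, ?, ?)",
                    (job, build, test, tag, details.strip()),
                )
    conn.commit()


# Made-up sample shaped like the assumed test report JSON.
sample_report = {
    "suites": [{"cases": [
        {"className": "media.PlaybackTest", "name": "testVolume",
         "status": "FAILED", "errorDetails": "Audio device not found",
         "errorStackTrace": None},
        {"className": "net.PingTest", "name": "testPing",
         "status": "PASSED", "errorDetails": None, "errorStackTrace": None},
    ]}]
}

conn = sqlite3.connect(":memory:")
store_failures(conn, "weekend-job", 42, sample_report)
audio_failures = conn.execute(
    "SELECT job, build, test FROM failures WHERE tag = 'Audio'"
).fetchall()
print(audio_failures)
```

The same `store_failures` logic could later move into a post-build step that feeds an external database, which is the migration path to the second option.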

You may want to have a look at couchdb-statistics. It is far from a perfect fit, but at least seems to do partially what you want to achieve.

Related

Lead time for changes

I am working on a project where I need to generate the lead time for changes per application, per day.
Is there any Prometheus metric that provides lead time for changes? And how do we integrate it into a Grafana dashboard?
There is not going to be a metric or dashboard out of the box for this. The way I would approach this problem is:
You will need to instrument your deployment code with the Prometheus client library of your choice. The deployment code will need to grab the commit time; assuming you are using git, you can use git log filtered to the folder that your application is in.
Now that you have the commit date, you can take the difference between it and the current time (after the app has been deployed to PRD) to get a lead time of X seconds.
To get it into Prometheus, use the node_exporter (or windows_exporter) and its textfile collector to read text files that your deployment code writes and surface them for Prometheus to scrape. Most of the client libraries have logic to help you write these files, and even where they do not, the text file format is simple enough to write directly.
You will want to surface this as a gauge metric, with a label to indicate which application was deployed. The end result will be a single metric that you can query from Grafana or build alerts on, and that will work for any application/folder that you deploy. To mimic the dashboard that you linked to, I am pretty sure you will want to use the over_time functions.
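The steps above can be sketched roughly as follows, writing the text file directly rather than through a client library. The metric name `deployment_lead_time_seconds`, the `application` label, the one-file-per-application layout and the sample timestamps are assumptions for illustration. The output directory would be whatever the textfile collector is configured to read.

```python
import os
import subprocess
import tempfile


def last_commit_timestamp(repo_dir, app_folder):
    """Unix time of the newest commit touching app_folder (assumes a git checkout)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", app_folder],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return int(out.stdout.strip())


def render_metric(application, lead_time_seconds):
    """Render one gauge sample in the Prometheus text exposition format."""
    name = "deployment_lead_time_seconds"  # hypothetical metric name
    return (
        f"# HELP {name} Seconds between last commit and production deploy.\n"
        f"# TYPE {name} gauge\n"
        f'{name}{{application="{application}"}} {lead_time_seconds}\n'
    )


def write_textfile(directory, application, lead_time_seconds):
    """Write <application>.prom via a temp file and rename, so a scrape never sees a partial file."""
    target = os.path.join(directory, f"{application}.prom")
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(render_metric(application, lead_time_seconds))
    # mkstemp creates the file owner-only; make it readable for the exporter.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, target)
    return target


# Example with fixed timestamps instead of a real git call and clock.
commit_time = 1_700_000_000
deploy_time = 1_700_003_600
lead_time = deploy_time - commit_time

out_dir = tempfile.mkdtemp()  # stand-in for the textfile collector directory
path = write_textfile(out_dir, "billing", lead_time)
with open(path) as handle:
    content = handle.read()
print(content)
```

In a real deployment script, `commit_time` would come from `last_commit_timestamp(...)` and `deploy_time` from `time.time()` once the rollout has finished.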
I also want to note that it might be easier for you to store the deployment/lead time in a SQL database (or something else other than Prometheus) and use that as a data source in Grafana. For applications that do not deploy frequently, you would easily run into missing series when querying Prometheus as a datastore, and the overhead of setting up the node_exporters and the logic to manage the text files might outweigh the benefits if you can just INSERT into a SQL table.

Can yaml tasks in a pipeline read and write historic data?

I am thinking about ways of speeding up certain parts of our CI pipeline and came across a question that is not clear to me.
Consider a scenario where I have to re-run a build on the same commit because of, e.g., flaky tests, deployment errors, or problems further downstream. Each re-run repeats certain steps, e.g. building the code or running static code analysis again.
Also given: we retrigger the build from a pull request, so we cannot manually disable certain steps like static code analysis.
My question is: can I write a task that reports and queries its state persistently across all build agents, without rolling my own cache?
ADO does this, for instance, when showing which "new" tests have failed compared to a previous run. To be able to detect "new", ADO needs some notion of a result cache that is stored on the server and is accessible.
What I am trying to evaluate is: can I create a task that, for instance, queries a global ADO cache for something like "did static code analysis complete for commit hash 424de2?" (regardless of which build agent I am currently on)?

How can I coordinate integration tests in a multi-container (Docker) system?

I have inherited a system that consists of a couple of daemons that asynchronously process messages. I am trying to find a clean way to introduce integration testing into this system with minimal impact/risk on the existing programs. Here is a very simplified overview of their responsibilities:
Process 1 polls a queue for messages, and inserts a row into a DB for each one it dequeues.
Process 2 polls the DB for rows inserted by Process 1, does some calculations, and then deposits a file into a directory on the host and sends an email.
These processes are quite old and complex, and I am strongly inclined to avoid modifying them in any way. What I would like to do is put each of them in a container, and also stand up the dependencies (queue, DB, mail server) in other containers. This part is straightforward, but what I'm unsure about is the best way to orchestrate these tests. Since these processes consume and generate output asynchronously I will need to poll or wait for the expected outcome (mail sent, file created).
Normally I would just write a series of tests in a single test suite of my language of choice (Java, Go, etc), and make the setUp / tearDown hooks responsible for resetting the environment to the desired state. But because these processes have a lot of internal state I am afraid I cannot successfully "clean up" properly after each distinct test. This would be a problem if, for example, one test failed to generate the desired output in a specific period of time so I marked it as failed, but a subsequent test falsely got marked as passed because the original test case actually did output something (albeit much slower than anticipated) that was mistakenly attributed to the subsequent test. For these reasons I feel I need to recreate the world between each test.
In order to do this the only options I can see are:
Use a shell script to actually run my tests -- having it bring up the containers, execute a single test file, and then terminate my containers for each test.
Follow my usual pattern of setUp / tearDown in my existing test framework but call out to docker to terminate and start up the containers between each test.
Am I missing another option? Is there some kind of existing framework or pattern used for this sort of testing?
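For reference, the second option could look roughly like the sketch below: a context manager that recreates the containers around each test, plus a helper that polls for the expected outcome with a deadline. The compose file name is hypothetical, the `docker compose` invocation assumes Compose v2 is installed, and `run` is injectable only so the sketch can be exercised without Docker.

```python
import contextlib
import subprocess
import time


@contextlib.contextmanager
def fresh_environment(compose_file, run=subprocess.run):
    """Start all containers before a test and destroy them (and their volumes) after."""
    base = ["docker", "compose", "-f", compose_file]
    run(base + ["up", "-d"], check=True)
    try:
        yield
    finally:
        # Tear everything down so no state leaks into the next test.
        run(base + ["down", "-v"], check=True)


def wait_until(check, timeout_s, interval_s=0.5):
    """Poll check() until it returns a truthy value or timeout_s elapses.

    Returns the truthy value, or None on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)
```

A test body would then be wrapped as `with fresh_environment("docker-compose.test.yml"):` followed by something like `assert wait_until(mail_was_sent, timeout_s=60)`, where `mail_was_sent` is a hypothetical check against the mail server container. Because the containers are destroyed after every test, late output from a slow test cannot be attributed to the next one.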

Multiple export using google dataflow

Not sure whether this is the right place to ask, but I am currently trying to run a Dataflow job that partitions a data source into multiple chunks in multiple places. However, I feel that if I try to write to too many tables at once in one job, the Dataflow job is more likely to fail with an HTTP transport exception, and I assume there is some bound on how much I/O, in terms of sources and sinks, I can wrap into one job?
To avoid this scenario, the best solution I can think of is to split this one job into multiple Dataflow jobs. However, that means I will need to process the same data source multiple times (once for each Dataflow job). That is okay for now, but ideally I want to avoid it in case my data source grows huge later.
Therefore I am wondering whether there is any rule of thumb for how many data sources and sinks I can group into one stable job. And is there a better solution for my use case?
From the Dataflow service description of structuring user code:
The Dataflow service is fault-tolerant, and may retry your code multiple times in the case of worker issues. The Dataflow service may create backup copies of your code, and can have issues with manual side effects (such as if your code relies upon or creates temporary files with non-unique names).
In general, Dataflow should be relatively resilient. You can Partition your data based on the location you would like it output. The writes to these output locations will be automatically divided into bundles, and any bundle which fails to get written will be retried.
If the location you want to write to is not already supported you can look at writing a custom sink. The docs there describe how to do so in a way that is fault tolerant.
There is a bound on how many sources and sinks you can have in a single job. Do you have any details on how many you expect to use? If it exceeds the limit, there are also ways to use a single custom sink instead of several sinks, depending on your needs.
If you have more questions, feel free to comment. In addition to knowing more about what you're looking to do, it would help to know if you're planning on running this as a Batch or Streaming job.
Our solution to this was to write a custom GCS sink that supports partitions, though with the responses I got I'm unsure whether that was the right thing to do. See: Writing Output of a Dataflow Pipeline to a Partitioned Destination

scheduled task or windows service

My team is having a debate about which is better: a Windows service or scheduled tasks. We have a server dedicated to running jobs, and currently they are all scheduled tasks. Some jobs take files, rename them and place them in other directories on the network. Other jobs extract data from SQL, modify it, and ship it elsewhere. Other jobs FTP files out. There is a lot of variety, but all in all, they are fairly straightforward.
I am partial to having each of these run as a Windows service instead of a scheduled task because it is so much easier to monitor a Windows service than a scheduled task. Others on the team hold the opposite view. In the end, none of us has enough experience to provide factual comparisons between the two methods. I am looking for some feedback on what others have experienced.
If it runs constantly - windows service.
If it needs to be run at various intervals - scheduled task.
Scheduled Task - When an activity is to be carried out on a fixed/predefined schedule. It takes less memory and fewer OS resources. No installation is required. It can have a UI (e.g. send a reminder mail to defaulters).
Windows Service - When continuous monitoring is required. It keeps the OS busy by consuming more resources. It requires install/uninstall when changing versions. No UI at all (e.g. process a mail as soon as it arrives).
Use them wisely
Scheduling jobs with the built-in functionality is a perfectly valid use. You would have to recreate the full functionality in order to create a good service, and unless you want to react to specific events, I see no reason to move a nightly job into a service.
It's different when you want to process a file after it has been posted to a folder. That's something I would create a service for, one that uses the filesystem watcher to monitor a folder.
I think it's reinventing the wheel.
While there is nothing wrong with using the Task Scheduler, it is itself a service. But we have the same requirements where I work, and we have a general-purpose program that does several of these jobs. I interpreted your post to say that you would run an individual service for each task. I would instead consider writing a single, database-driven (service) program to do all your tasks; that way, adding a new one is simply a data entry chore and not a whole new program to write. If you practice change control, this difference can be significant. If you have more than a few tasks, the effort may be comparable. This approach will also allow you to craft a logging mechanism best suited to your operations.
This is a portion of our requirements document for our task program, to give you an idea of where to start:
This program needs to be database driven.
It needs to run as a windows service.
The program needs to be able to process "jobs" in the following manner:
Jobs need to be able to check for the existence of a source file, and take action based on whether or not it exists (i.e. proceed with processing, vs. report that the file isn't there, vs. ignore it because it is not critical that the file isn't there).
Jobs need to be able to copy a file from a source to a target location or
Copy a file from source, to a staging location, perform "processing", and then copy either the original file or a result of the "processing" to the target location or
Copy a file from source, to a staging location, perform "processing", and the processing is the end result.
The sources and destination that jobs might copy to and from can be disparate: UNC, SFTP, FTP, etc.
The "processing", can be, encrypting/decrypting a file, parsing a data file for correct format, feeding the file to the mainframe via terminal emulation, etc., usually implemented by calling a command line passing parameters to an .exe
Jobs need to be able to clean up after themselves, as required. i.e. delete intermediate or original files, copy files to an archive location, etc.
The program needs to be able to determine the success and failure of each phase of a job and take appropriate action which would be logging, and possibly other notification, abort further processing on failure, etc.
Jobs need to be configured to activate at certain set times, or at certain intervals (optionally during certain set hours) i.e. every 15 mins from 9:00 - 5:00.
There needs to be a UI to add new jobs.
There needs to be a button to push to fire off a job as if a timer event had activated it.
The standard display of the program should show an operator what is going on and whether the program is functioning properly.
All of this is predicated on the premise that it is a given that you write your own software. There are several enterprise task scheduler programs available on the market, as well. Buying off the shelf may be a better solution for you.
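To give an idea of the scheduling requirement (jobs that fire at intervals, optionally only during certain hours), here is a minimal sketch of computing the next activation time. It is not taken from our program; the function name is hypothetical, "9:00 - 5:00" is read as 09:00 to 17:00, and slots are assumed to be aligned to the start of the window.

```python
from datetime import datetime, time, timedelta


def next_run(now, interval, window_start, window_end):
    """Return the first slot strictly after `now` inside the daily window.

    Slots are window_start, window_start + interval, ... up to and
    including window_end. The window must not cross midnight.
    """
    if window_start > window_end:
        raise ValueError("window must not cross midnight")
    for day_offset in (0, 1):
        day = now.date() + timedelta(days=day_offset)
        slot = datetime.combine(day, window_start)
        end = datetime.combine(day, window_end)
        if slot <= now:
            # Jump to the first slot strictly after `now`.
            steps = (now - slot) // interval + 1
            slot += steps * interval
        if slot <= end:
            return slot
    # Unreachable for a valid window: tomorrow's window_start always qualifies.
    raise RuntimeError("no slot found")


every_15 = timedelta(minutes=15)
start, end = time(9, 0), time(17, 0)

during_window = next_run(datetime(2024, 1, 8, 9, 7), every_15, start, end)
after_window = next_run(datetime(2024, 1, 8, 17, 0), every_15, start, end)
print(during_window)  # 2024-01-08 09:15:00
print(after_window)   # 2024-01-09 09:00:00
```

In a database-driven design, `interval`, `window_start` and `window_end` would be columns of the job record, and the service's timer loop would call something like `next_run` after each activation.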
