Zip old builds in Jenkins plugin

Is there a Jenkins plugin to ZIP old builds? I don't want to package just the generated archive (I am deleting those); I want to ZIP just the log data and the data used by tools like FindBugs, Checkstyle, Surefire, Cobertura, etc.
Currently I am running out of disk space because of Jenkins. Some build log files reach 50 MB because we run 3000+ unit tests (most of these are severely broken builds full of stacktraces in which everything fails). This happens frequently in some of my projects, so I get a log like this for every bad build. Good builds are milder and may reach around 15 MB, but that is still a bit costly.
The Surefire XML files for these builds are huge too. Since they tend to contain very repetitive data, I could save a lot of disk space by zipping them, but I know of no Jenkins plugin for this.
Note: I am already deleting old builds that are no longer needed.

The 'Compress Buildlog' plugin does pretty much exactly what you're asking for, for the logs themselves at least.
https://github.com/daniel-beck/compress-buildlog-plugin
For everything else, you'll probably want an unconditional step after your build completes that manually applies compression to other generated files that will stick around.
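For example, something along these lines as a post-build shell step, or a cron job against $JENKINS_HOME, would do it. This is only a sketch: the path pattern, size and age thresholds are assumptions, and some plugins may no longer be able to read their report files once they are gzipped.
cd "$JENKINS_HOME/jobs"
# compress large XML reports kept alongside builds older than 5 days
find . -path "*/builds/*" -name "*.xml" -size +1M -mtime +5 -print0 \
  | xargs -0 gzip -9v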

The Administering Jenkins guide gives some guidance on how to do this manually. It also links to the following plugins:
Shelve Project
thinBackup
The last one is really designed to back up the Jenkins configuration, but it also has options for build results.

Although this question was asked about 3 years ago, other people may be searching for the same thing, so here is my answer.
If you want to compress the current build job's log, use this Jenkins plugin.
If you want to compress the logs of old Jenkins builds, use the following script; -mtime +5 means the file was last changed more than 5 days ago:
cd "$JENKINS_HOME/jobs"
find * -name "log" -mtime +5|xargs gzip -9v '{}'

Related

In Jenkins, can we delete build artifacts for older builds, but keep build details/logs?

As is good practice, I've got Jenkins set up at work to automatically build everything for continuous integration, pulling files from our Git repositories. On our development branches, builds get kicked off automatically whenever anyone commits a change. When we want to do formal testing, we pull the build from Jenkins and use that; and when we want to sign off a change request, we quote the Jenkins build number where the change went in. So far, so good.
The problem we have is that builds are a significant size. For our SDK, we have to build across multiple platforms so that we can check it works on all of them. At maybe 50 MB per build, this starts to mount up! Short term I can keep asking IT to give me more storage space, but longer term I'd like a more strategic solution.
The obvious answer in Jenkins is to set up deletion rules, whether deleting after some time or after some number of builds. The problem then though is that if we delete that older development build, we lose the traceability of what we tested. I'm sure most engineers at one time or another have had to do a binary chop through older builds to find an obscure bug/regression which was only spotted some time later. For me, it is unacceptable to lose that history.
The important feature of build history though is not the binary build artifacts, but the build log recording what Git commits (or anything else; toolchain versions for example) went into each build. That's what lets us go back to investigate older builds and recreate them if required. The build log is relatively small (and highly compressible, being a text file). We do still need to keep build artifacts for recent builds though, so that testers can use them. So I'm thinking a better alternative would be to preserve the build log in Jenkins for all builds, but to have Jenkins automatically delete build artifacts after some time.
Does anyone know of a way in Jenkins (perhaps a plugin?) which would let us automatically delete/archive build artifacts from older builds, but still keep the build details and log for those builds? I'm happy to do a Jenkins upgrade if necessary to get this feature. And of course this needs to be only for selected development build jobs - all release build jobs need their build artifacts to be preserved forever, as do any builds which have the "keep forever" button ticked.
If it's absolutely necessary, I could set up a separate cron job to do this on the Jenkins file area. That's a nasty hack though, and I suspect it's likely to cause some issues with Jenkins, so I'd rather not do something that brute-force if there's a better alternative.
I think you need this option in your Jenkinsfile:
buildDiscarder(logRotator(artifactNumToKeepStr: '10'))
artifactNumToKeepStr: only this many builds have their artifacts kept.

How to display configuration differences between two Jenkins builds?

I want to display non-code differences between current build and the latest known successful build on Jenkins.
By non-code differences I mean things like:
Environment variables, including Jenkins parameters (set), maybe with some filter
Version of system tool packages (rpm -qa | sort)
Versions of python packages installed (pip freeze)
While I know how to save and archive these files as part of the build, the part that is not clear is how to generate the diff/change report of the differences between the current build and the last successful build.
Please note that I am looking for a pipeline compatible solution and ideally I would prefer to make this report easily accessible on Jenkins UI, like we currently have with SCM changelogs.
Or to rephrase this, how do I create build manifest and diff it against last known successful one? If anyone knows a standard manifest format that can easily be used to combine all these information it would be great.
You always ask the most baller questions, nice work. :)
We always try to push as many things into code as possible because of the same sort of lack of traceability you're describing with non-code configuration. We start by using Jenkinsfiles, so we capture a lot of the build configuration there (in a way that still shows changes in source control). For system tool packages, we get that into the app by using Docker and by inheriting from a specific tag of the Docker base image. So even if we want to change system packages or even the Python version, for example, that would manifest as an update of the FROM line in the app's Dockerfile. Even environment variables can be micromanaged by Docker, to address your other example. There's more detail about how we try to sidestep your question at https://jenkins.io/blog/2017/07/13/speaker-blog-rosetta-stone/.
There will always be things that are hard to capture as code, and builds will therefore still fail and be hard to debug occasionally, so I hope someone pipes up with a clean solution to your question.
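In the meantime, here is a minimal, pipeline-compatible sketch of the diff step itself (run it from sh steps). It assumes the manifest files are generated with the names below (made up here) and archived with archiveArtifacts, and that the job can fetch its own artifacts over $JOB_URL; a locked-down instance may need credentials on the curl call.
# capture the manifests for this build
printenv | sort > env.txt        # filter out secrets/noise as needed
rpm -qa | sort > rpms.txt
pip freeze > pip-freeze.txt
# pull the same manifests from the last successful build and diff them
for f in env.txt rpms.txt pip-freeze.txt; do
  curl -sf "$JOB_URL/lastSuccessfulBuild/artifact/$f" -o "previous-$f" \
    && diff -u "previous-$f" "$f" > "diff-$f" || true
done
# archive *.txt (e.g. archiveArtifacts artifacts: '*.txt') so the report
# shows up on the build page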

VSTS agent very slow to download artifacts from local network share

I'm running an on-prem TFS instance with two agents. Agent 1 has a local path where we store our artifacts. Agent 2 has to access that path over a network path (\\agent1\artifacts...).
Downloading the artifacts from agent 1 takes 20-30 seconds. Downloading the artifacts from agent 2 takes 4-5 minutes. If from agent 2 I copy the files using explorer, it takes about 20-30 seconds.
I've tried adding other agents on other machines. All of them perform equally poorly when downloading the artifacts but quick when copying manually.
Anyone else experience this or offer some ideas of what might work to fix this?
Yes, it's definitely the v2 agent that's causing the problem.
Our download-artifacts step has gone from 2 minutes to 36 minutes, which is completely unacceptable. I'm going to try out agent v2.120.2 to see if that's any better...
Agent v2.120.2
I think it's because of the number of files in our artifacts: we have 3.71 GB across 12,042 files in 2,604 folders!
The other option I will look into is zipping, or creating a NuGet package for, each published artifact and then unzipping after the drop. Not the ideal solution, but something I've done before when needing to use RoboCopy, which is apparently what this version of the agent uses.
RoboCopy is not great at handling lots of small files, and having to create a handle for each file across the network adds a lot of overhead!
Edit:
The change to the newest version made no difference. We've decided to go a different route and use an artifact type of "Server" rather than "File Share", which has sped it up from 26 minutes to 4.5 minutes.
I've found the source of my problem and it seems to be the v2 agent.
Going off of Marina's comment I tried to install a 2nd agent on 01 and it had the exact same behavior as 02. I tried to figure out what was different and then I noticed 01's agent version is 1.105.7 and the new test instance is 2.105.7. I took a stab in the dark and installed the 1.105.7 on my second server and they now have comparable artifact download times.
I appreciate your help.

Jenkins: Performance plugin shows only recent build data in the perf report

I set up a JMeter job in Jenkins, which is supposed to publish *.jtl results and then display them in a nice trend graph.
But despite seeing that they're published under the builds//performance-results/JMeter folders, the trend always shows only the current day's results. So if I run this build three times during a day, I'll see a graph with three points; if there was just one run today, I'll see one run on that graph. I don't see yesterday's (or older) results on the graph. I'd like this trend to display all the data from all the previous builds, including yesterday's, etc.
What should I check? How does the Performance plugin decide which *.jtl data to use for the graph?
In the settings of the job I have this pattern for the JTL source: **/*.jtl, so I would expect data from all builds to be displayed on the trend...
Apparently the solution is very simple. Found it myself!
By default all JTL files had a timestamp at the beginning of their name, thanks to the jmeter-maven-plugin; the pattern was yyyyMMdd. The trend report in Jenkins displayed the last build's results, and because of that pattern the JTL file names for all builds run on the same day were the same, but different from the previous day's.
So the easiest solution was to remove that timestamp from the results file name by setting
<testResultsTimestamp>false</testResultsTimestamp>
in the configuration section of the jmeter-maven-plugin in the POM file.
What is annoying is that the Performance plugin folks haven't put it into the documentation - the requirement for the results file to keep the same name in order to be displayed on the graph...
Apart from that, there is an issue with the Performance plugin (versions 1.12 and 1.13): because of it, the Last Report image doesn't show and other reports are missing info.
To fix it, you can either download/git clone the latest code from the Performance plugin GitHub repo and build it locally (using mvn clean install, which gives you the performance.hpi Jenkins plugin file), or revert back to Performance plugin 1.11.
As 1.12/1.13 have some other enhancements over 1.11, I chose to build it myself, until someone fixes the Performance plugin and puts out a new release (i.e. 1.14 containing the fix for this issue).
Issue: https://issues.jenkins-ci.org/browse/JENKINS-27100
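For reference, the local build is roughly this (a sketch; it assumes Git, Maven and a JDK are installed and that the plugin sources live in the jenkinsci organisation on GitHub):
git clone https://github.com/jenkinsci/performance-plugin.git
cd performance-plugin
mvn clean install
# the resulting target/performance.hpi can then be uploaded through
# Manage Jenkins > Manage Plugins > Advanced > Upload Plugin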

Using Jenkins, Perforce, and Ant, how can I run PMD only on files that have changed since the last green build?

Given that:
There seems to be no easy way to get a list of "changed" files in Jenkins (see here and here)
There seems to be no fast way to get a list of files changed since label xxxx
How can I go about optimising our build so that when we run PMD it only runs against files that have been modified since the last green build?
Backing up a bit… our PMD takes 3–4 minutes to run against ~1.5 million lines of code, and if it finds a problem the report invariably runs out of memory before it completes. I'd love to trim a couple of minutes off of our build time and get a good report on failures. My original approach was that I'd:
get the list of changes from Jenkins
run PMD against a union of that list and the contents of pmd_failures.txt
if PMD fails, include a list of failing files in pmd_failures.txt
More complicated than I'd like, but worth having a build that is faster but still reliable.
Once I realised that Jenkins was not going to easily give me what I wanted, I realised that there was another possible approach. We label every green build. I could simply get the list of files changed since the label and then I could do away with the pmd_failures.txt entirely.
No dice. The idea of getting a list of files changed since label xxxx from Perforce seems to have never been streamlined from:
$ p4 files //path/to/branch/...@label > label.out
$ p4 files //path/to/branch/...@now > now.out
$ diff label.out now.out
Annoying, but more importantly even slower for our many thousands of files than simply running PMD.
So now I'm looking into trying to run PMD in parallel with other build stuff, which still wastes time and resources and makes our build more complex. It seems daft to me that I can't easily get a list of changed files from Jenkins or from Perforce. Has anyone else found a reasonable workaround for these problems?
I think I've found the answer, and I'll mark my answer as correct if it works.
It's a bit more complex than I'd like, but I think it's worth the 3-4 minutes saved (and potential memory issues).
At the end of a good build, save the good changelist number as a Perforce counter (post-build task). It looks like this:
$ p4 counter last_green_trunk_cl %P4_CHANGELIST%
When running PMD, read the counter into the property last.green.cl and get the list of files from:
$ p4 files //path/to/my/branch/...@${last.green.cl},@now
//path/to/my/branch/myfile.txt#123 - edit change 123456 (text)
//path/to/my/branch/myotherfile.txt#123 - add change 123457 (text)
etc...
(you have to parse the output; a sketch of that is below)
Run PMD against those files.
That way we don't need the pmd_failures.txt and we only run PMD against files that have changed since the last green build.
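A rough sketch of that parsing step in shell (the depot-to-workspace mapping done with sed is an assumption - substitute your own client mapping - and deleted files are filtered out because PMD cannot scan them):
last_green_cl=$(p4 counter last_green_trunk_cl)
p4 files "//path/to/my/branch/...@${last_green_cl},@now" \
  | grep -vE ' - (move/)?delete ' \
  | sed -e 's/#[0-9]* - .*$//' -e 's|^//path/to/my/branch/|src/|' \
  > pmd-files.txt
# feed pmd-files.txt to the PMD invocation (e.g. build an Ant fileset from it)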
[EDIT: changed it to use p4 counter, which is way faster than checking in a file. Also, this was very successful so I will mark it as answered]
I'm not 100% sure, since I've never used Perforce with Jenkins, but I believe Perforce passes the changelist number through the environment variable $P4_CHANGELIST. With that, you can run p4 filelog -c $P4_CHANGELIST, which should give you the files from that particular changelist. From there, it shouldn't be hard to script something up to get just the changed files (plus the old failures) into PMD.
I haven't used Perforce in a long time, but I believe the -ztag global option makes it easier to parse p4 output from the various scripting languages.
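For example (a sketch - the branch path is a placeholder, and the sed expression just pulls the depotFile lines out of the tagged output):
p4 -ztag filelog -c "$P4_CHANGELIST" //path/to/branch/... \
  | sed -n 's/^\.\.\. depotFile //p' > changed-files.txt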
Have you thought about using automatic labels? They're basically just an alias for a changelist number, so it's easier to get the set of files that differ between two automatic labels.
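For example (a sketch; the label and branch path are placeholders, and the --field trick for setting the label's Revision field needs a reasonably recent p4 client):
# after a green build, pin an automatic label to the changelist just built
p4 --field "Revision=@$P4_CHANGELIST" label -o last_green | p4 label -i
# the files that changed since the last green build are then simply:
p4 files "//path/to/branch/...@last_green,@now"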
