Is there a groovy way of loading Jenkins builds? - jenkins

We have a kind of blue-green deployment which means we have 2 instances working at the same time and the difference between them is that one instance has few builds that don't exist on the other Jenkins.
We need now to rsync missing build folders from one Jenkins to other and cannot restart the second Jenkins that gets these builds.
So we would like to know a groovy way of loading competed builds from disk into Jenkins job history.
I saw loadBuild API and was happy for a moment but then I noticed it's protected.
protected R loadBuild(File dir)
throws IOException
Loads an existing build record from disk.
Throws:
IOException
This is to get an idea what we want to achieve.

Related

Cleanup Jenkins home directory

We have started to use jenkins from last few months and now the size of home directory is about 50GB. I noticed that size of Jobs and workspace directories are about 20 GB each. How can I clean them? What kind of strategy I should use?
Consider the various Jenkins areas that tend to grow excessively. The key areas are: system logs, job logs, artifact storage and job workspaces. This will detail options to best manage each of these.
System logs
System logs may be found in <JENKINS_HOME>/logs or /var/log/jenkins/jenkins.log, depending on your installation. By default, Jenkins does not always include log rotation (logrotate), especially if running straight from the war. The solution is to add logrotate. This Cloudbees post and my S/O response add details.
You can also set the Jenkins System Property hudson.triggers.SafeTimerTask.logsTargetDir to relocate the logs outside the <JENKINS_HOME>. Why answered later.
Job Logs
Each job has an option to [ X ] Discard old builds. As of LTS 2.222.1, Jenkins introduced a Global Build discarder (pull #4368) with similar options and default actions. This is a global setting, Prior to that, job logs (and artifacts) were retained forever by default (not good).
Advanced options can manage artifact retention (from post-build action, "Archive the artifacts" separately.
What's in Jobs directory?
The Jobs directory contains a directory for every job (and folders if you use them). Inside the directory is the job config.xml (a few KB in size), plus a directory builds. builds has a numbered directory holding the build logs for each retained build, a copy of the config.xml at runtime and possibly some additional files of record (changelog.xml, injectedEnvVars.txt). IF you chose the Archive the artifacts option, there's also an archive directory, which contains the artifacts from that build.
Jenkins System Property, jenkins.model.Jenkins.buildsDir, lets you relocate the builds to outside the <JENKINS_HOME>
Why Relocate logs outside <JENKINS_HOME>?
I would strongly recommend relocating both the system logs and the job / build logs (and artifacts). By moving the system logs and build logs (and artifacts if ticked) outside of <JENKINS_HOME>, what's left is the really important stuff to back and restore Jenkins and jobs in the event of disaster or migration. Carefully read and understand the steps "to support migration of existing build records" to avoid build-related errors. It also makes it much easier to analyze which job logs are consuming all the space and why (ie: logs vs artifacts).
Workspaces
Workspaces are where the source code is checked out and the job (build) is executed. Workspaces should be ephemeral. Best Practicesare to start with an empty workspace and clean up when you are done - use Workspace Cleanup ( cleanWS() ) plugun, unless necessary.
The OP's indication of a workspaces in the Jenkins controller suggests jobs are being run on the master. say that's not a good (or secure) practice, except lightweight pipelines always execute on master. Mis-configured job pipelines will also fall back to master (will try find reference). You can set up a node physically running on the same server as the master for better security.
You can use cleanws() EXCLUDE and INCLUDE patterns to selectively clean the workspace if delete all is not viable.
There are two Jenkins System Properties to control the location of the workspace directory. For the master: jenkins.model.Jenkins.workspacesDir and for the nodes/agents: hudson.model.Slave.workspaceRoot. Again, as these are ephemeral, get them out of <JENKINS_HOME> so you can better manage and monitor.
Finally, one more space consideration...
Both maven and npm cache artifacts in a local repository. Typically that is located in the user's $HOME directory. If incrementing versions often, that content will get stale and bloated. It's a cache, so take a time hit every once in a while and purge it or otherwise mange the content.
However, it's possible to relocate the cache elsewhere through maven and npm settings. Also, if running a maven step, every step has the Advanced option to have a private repository. That is located within in job's workspace. The benefit is you know what your build is using; no contamination. The downside though is massive duplication and wasted space if all jobs have private repos and you never clean them out or delete the workspaces, or longer build times every time if you cleaned. Consider using s the cleanWS() or a separate job to purge as needed.
The workspaces can be cleaned after and/or prior any execution. I recommend doing it prior and after an execution. After the build, do it only on successful builds. In case of errors, you can enter the workspaces and check there for any clue. You do it on the pipeline using the CleanWs() command.
For jobs directory you can select on your jobs the amount of time / maximum of executions to store. This is more complicated because it depends on what you want to save. For example if there is a lot of builds and you don't mind deleting that information you could save 10 builds during 30 days . That configuration is on the job configuration under job properties and search for "Discard old builds" and "Days to keep build" and "Max # builds to keep".
My suggestion is that you use larger numbers at first and then you can test how it behaves

What happens to data on a CI pipeline

I've been asked to create a CI pipeline for a project at my work, I'm creating a load test with JMeter and Taurus so I plan to integrate it with Jenkins to build all the pipeline. I'm just starting on this field and a question that came to my mind is:
What happens to all the data created by the Load Test? does it goes to the deploy phase or it gets deleted once the test is done, should I clean after the tests end?
The data is being kept in the Jenkins workspace and by default it will be kept in the file system forever.
If you decide to publish the artifacts they will be available at Jenkins build dashboard via the web interface.
You might also be interested in Jenkins Performance Plugin which allows plotting performance trend charts and conditionally marking builds as unstable or failed depending on pass/fail thresholds
Example configuration can be found in the How to Run a Taurus Test through Jenkins Pipelines article
I am not completely familiar with your setup but as far as I can see from a quick research, JMeter does the same as every other testing framework and generates HTML reports. Jenkins wont delete them, unless you explicitly delete them (rm file.html) or call cleanWs (clean workspace). If the job is deleted so are the files.
So the test result file should still be present in the deploy phase. You can use a plugin to collect the result. Or just archive it. Or do whatever fits your workflow.
There is generally no need to clean it up (you usually configure Jenkins to delete old builds which takes care of that)

Jenkins - Copy Artifacts from upstream job built in different node

There is a job controlled by Development team which built in a different node. I am on Testing team who want to take the artifacts and deploy on test device.
I can see those Artifacts from dev are stored in some path in dev's node. Does it means it must first archived in Jenkins master before I can copy it to my job?
I am using Copy Artifact plugin and constantly getting the error
Failed to copy artifacts from <dev-job> with filter: <path-in-dev-node>
*Some newbie question since i just moved from TeamCity
You probably want to use: Copy Artifact plugin.
Adds a build step to copy artifacts from another project.
Consider also, the Jenkins post-buid step "Archive the artifacts".
If you copy from the other job's workspace, what happens if another job is in progress or the workspace is wiped? That step copies them from the node to the master and stores a copy along with the build logs, etc. That makes them available via the UI as long as the build logs remain. It can take up space tho.
If you do use archive artifacts, consider using the system property jenkins.model.Jenkins.buildsDir to store all the build logs (and artifacts) outside of the jobs config directory. Some downtime and work required to separate the two (config / logs) .
You may also want to consider using a proper repository manager (Nexus / artifactory)
Finally, you may want to learn about using a Jenkins pipeline rather the relying on chained jobs, triggers or users and so forth. Why? 'cos it's much more controlled and easier to maintain.
ps: I'm not a huge fan of artifactDeployer, but it may work for you.
pps: you may want to review this in depth answer: Jenkis downstream job fails to find upstream artifacts

Jenkins Multibranch Pipeline - Issues with deleting jobs

Use case: Using Jenkinsfile to auto create builds for branches
Summary:
For a variety of reasons sometimes the Jenkins master fails to connect to the SCM server. When this occurs Jenkins deletes that job directory on master, because it no longer sees the branches. However, the slaves are not cleaned up and so they still have the old workspace paths (which are uniquely named based on the build # in my setup). When the Jenkins master reconnects to the SCM server, it recreates a new job folder on master, and the build counter is reset to #1.
This creates the following issues:
When a build starts, it executes on a slave. Since master has a new counter the job is #1. But this path may already exist from a previous build on that slave, so the artifact is built with content that was checked out for the original old build (i.e. maven uses the /target directory inside the workspace which already existed from previous build). So the end result is an artifact that potentially has the wrong code.
This can create build storms. After the connection issues are resolved, Jenkins will see all the repositories and branches with Jenkinsfiles and start to build them. So in a setup of let's say 20 repositories with 10 branches each, this will create 200 new builds. This increases with additional repositories and branches. This is obviously not desired.
Solutions:
One quick solution I can think of is to update the Jenkinsfile to delete the workspace if it exists before running the job inside of it. But this is just a work around. I would not want to mask the connection issues and would like to retain the actual build history of a pipeline (not have it keep erasing itself).
Minimize connection issues. This obviously will not always be guaranteed though. Plus sometimes maintenance must force servers offline. While I can construct maintenance in a way to limit or work around such issues, there still will be rare cases where downtime is required across the board. It would be best if Jenkins could handle this use case.
I'm curious if anyone has ran into this issue and what the thoughts are on this problem?

Delegate specific part of build to slave

I have a project where part of the build process is to create a native library on a remote machine. This is currently a manual process outside of the CI builds made by Jenkins.
The setup in question is that the Jenkins master server build a GIT based maven project, which has a dependency to a native library which can only be built on a specific machine. Jenkins can't compile this module, and because of this, it is currently a manual process.
I would like to install a Jenkins slave on the machine that creates the native library, and returns the compiled files to the Jenkins master, without handling any other parts of the build.
I am having trouble figuring out if this is even possible. The number of articles i have found on the subject discusses Jenkins slaves as a means of distributing the build, but i want the slave to take responsibility for a small part of the build process, and nothing else. The Jenkins master should just send the build request to the slave and wait for the result, instead of trying to compile the code itself.
I do exactly the same. My setup, very similar to what Mark O'Connor and gaige are advising, and I am using the Copy Artifact plugin.
job A: produces a zip file on a Mac
job B, runs on slave B - Windows machine, takes the zip as input and produces an MSI
Here's the important part in the config of job B:
restrict the job B on the proper slave using labels
make sure job B happens after job A
make sure artifacts from job A are sent to job B before your build
build your stuff
archive artifacts produced by job B
Delegating part of a job to a slave is something that would have to be done external to Jenkins, for example, using ssh.
However, as #kan indicates, you most likely want to extract the native library build as a separate job and then have that job execute on a particular slave, or any slave that meets a specific criteria.
To do this, my suggestion would be to use Labels in the node configurations to determine which slaves can be used for building that particular job.
In Jenkins > nodes > <slave node>, use the Labels property to set one-word labels that indicate your specific requirements, such as the OS or processor type.
Then, in the jobs that are node-specific, check Restrict where this project can be run and set the Label Expression to something that meets your criteria. If the criteria is simple, it will just be a single word, if you need a boolean, you can use those as well (such as OSX&&Lion in our case).
I believe this is all in the standard version of Jenkins, without need for a special plugin. Leave me a comment if it isn't and I'll try and diagnose which plugin enables this functionality.
This is problem is solved by using a binary repository manager to centralize your software artifacts. Personally I use Nexus, but it could be something as dumb as a remote file system.
The idea is to publish the built artifact after each Jenkins job (if you don't like Nexus, you could use one of the Publish over plugins) and retrieve it as a build dependency in the next job.
This approach means it longer matters where the build executes, and has the added advantage of decoupling the build of each module component.

Resources