Jenkins: remove repo after build, keep only log

Jenkins checks out my projects' repositories for every build and every PR. This quickly fills up the disk (only 10 GB), as each checkout amounts to roughly 300 MB and there are 5 projects (all in the range of 300-500 MB per project). We've already set Discard old items with empty values, but it doesn't seem to delete the files once the PR has another build.
I've noticed the files are stored at:
/data/versioning/config/jobs/MyProjectAbc/branches/PR-9424/workspace#script/
Is there an option for Jenkins to delete the whole PR-xxxx/workspace#script folder and only keep the PR-xxxx/builds folder?
Lightweight checkouts currently aren't possible (possibly because of an outdated plugin, Bitbucket Branch Source 2.2.8).

You could use the Workspace Cleanup Plugin in the post section of your pipeline to clear out the workspace after each build.
Something like:
pipeline {
    agent any
    stages {
        stage('Build') {
            steps { echo 'your build steps here' }
        }
    }
    post {
        always {
            cleanWs() // provided by the Workspace Cleanup plugin
        }
    }
}

Related

Jenkins: stash vs archiveArtifacts

What are the use cases and pros/cons for using stash vs archiveArtifacts?
The documentation mentions each:
i.e. https://jenkins.io/doc/pipeline/steps/workflow-basic-steps/#stash-stash-some-files-to-be-used-later-in-the-build
and
https://jenkins.io/doc/pipeline/tour/tests-and-artifacts/
but doesn't do a comparison.
stash is used to "save" some files in a pipeline stage and reuse them on a different slave (unstash). Stash is only useful when you have a small set of files; it becomes very slow when you want to stash a large amount of data. If you need to stash a lot of files, it's recommended to use a shared filesystem between your slaves so the contents of your workspace can be used by multiple slaves.
Archiving artifacts saves the artifacts on the Jenkins master. You can specify whether you only want to keep the artifacts generated by the last build or by more builds. This is useful when you have a deploy job on your master that deploys the artifacts after a successful run, or when you want to make them available in your Jenkins console.
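As an illustration, here is a minimal sketch contrasting the two in a single pipeline; the stage names, agent labels, shell scripts, and file patterns are placeholders:
pipeline {
    agent none
    stages {
        stage('Build') {
            agent { label 'builder' } // placeholder label
            steps {
                sh './build.sh'                                  // placeholder build step
                stash includes: 'target/*.jar', name: 'binaries' // small file set, reused later in this build
                archiveArtifacts artifacts: 'target/*.jar'       // kept on the master per the retention policy
            }
        }
        stage('Test') {
            agent { label 'tester' } // a different node
            steps {
                unstash 'binaries'   // restores the stashed files into this workspace
                sh './run-tests.sh'  // placeholder test step
            }
        }
    }
}
The stash only lives for the duration of the build, while the archived artifacts survive it.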
From the latest Pipeline Syntax documentation and Options directive:
https://jenkins.io/doc/book/pipeline/syntax/#options
preserveStashes
Preserve stashes from completed builds, for use with stage restarting. For example: options { preserveStashes() } to preserve the stashes from the most recent completed build, or options { preserveStashes(buildCount: 5) } to preserve the stashes from the five most recent completed builds.
In theory this seems much the same as using archiveArtifacts with the buildDiscarder option to apply an artifact retention policy.
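For comparison, a sketch of the two retention options side by side inside a pipeline's options directive (the counts are arbitrary):
options {
    // keep the stashes of the 5 most recent completed builds (for stage restarting)
    preserveStashes(buildCount: 5)
    // keep the logs and archived artifacts of the 5 most recent builds
    buildDiscarder(logRotator(numToKeepStr: '5', artifactNumToKeepStr: '5'))
}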

Jenkins multibranch pipeline only for subfolder

I have a git monorepo with different apps. Currently I have a single Jenkinsfile in the root folder that contains the pipeline for all apps. It is very time consuming to execute the full pipeline for all apps when a commit changes only one app.
We use a GitFlow-like approach to branching, so Multibranch Pipeline jobs in Jenkins are a perfect fit for our project.
I'm looking for a way to have several jobs in Jenkins, each one triggered only when the code of the corresponding application has changed.
The perfect solution for me looks like this:
I have several Multibranch Pipeline jobs in Jenkins. Each one looks for changes only to a given directory and its subdirectories, and each one uses its own Jenkinsfile. The jobs poll git every X minutes; if there are changes to the relevant directories in existing branches, a build is triggered, and if there are new branches with changes to the relevant directories, a build is triggered as well.
What stops me from implementing this
I'm missing a way to define which folders' commits should be ignored during the scan performed by the Multibranch Pipeline. The "Additional behaviours" section for a Multibranch Pipeline doesn't offer the "Polling ignores commits in certain paths" option, while Pipeline or Freestyle jobs have it. But I want to use a Multibranch Pipeline.
The solution described here doesn't work for me, because if there is a new branch with changes only to "project1", the Multibranch Pipeline for "project2" will still discover this new branch and build it. That means that for every new branch, each of my Multibranch Pipelines will be executed at least once, regardless of whether the relevant code changed or not.
I'd appreciate any help or suggestions on how I can implement a few Multibranch Pipelines watching the same git repository but triggered only when the appropriate pieces of code change.
This can be accomplished by using the Multibranch build strategy extension plugin. With this plugin, you can define a rule so that a build is only triggered when the changes belong to a given sub-directory.
Install the plugin
On the Multibranch pipeline configuration, add a Build strategy
Select Build included regions strategy
Put a sub-folder in the field, such as subfolder/**
This way the changes will still be discovered, but they won't trigger a build if they don't belong to the specified set of files or folders.
This is the best approach I'm aware of so far, although ideally the changes wouldn't even get discovered.
Edit: Gerrit Code Review plugin configuration
In case you're using the Gerrit Code Review plugin, you can also prevent new changes from being discovered by using a custom query in the job configuration.
I solved this by creating a project that builds other projects depending on the files changed. For example, from your repo root:
/Jenkinsfile
#!/usr/bin/env groovy
pipeline {
    agent any
    options {
        timestamps()
    }
    triggers {
        bitbucketPush()
    }
    stages {
        stage('Build project A') {
            when {
                changeset "project-a/**"
            }
            steps {
                build 'project-a'
            }
        }
        stage('Build project B') {
            when {
                changeset "project-b/**"
            }
            steps {
                build 'project-b'
            }
        }
    }
}
You would then have other Pipeline projects with their own Jenkinsfile (e.g., project-a/Jenkinsfile).
I know that this post is quite old, but I solved this problem by changing the "include branches" parameter for SVN repositories (this can possibly also be done using the property "Filter by name (with wildcards)" for git repos). Instead of supplying only the actual branch name, I also included the subfolder. So instead of only supplying "trunk", I used "trunk/subfolder". This limits scanning to only that specific directory. Note that I have not yet fully tested this solution.

Is there a way to run a pre-checkout step in declarative Jenkins pipelines?

Jenkins declarative pipelines offer a post directive to execute code after the stages have finished. Is there a similar thing to run code before the stages are running, and most importantly, before the SCM checkout?
For example something along the lines of:
pre {
    always {
        rm -rf ./*
    }
}
This would then clean the workspace of my build before the source code is checked out.
pre is a cool feature idea, but doesn't exist yet. skipDefaultCheckout and checkout scm (which is the same as the default checkout) are the keys:
pipeline {
    agent { label 'docker' }
    options {
        skipDefaultCheckout true
    }
    stages {
        stage('clean_workspace_and_checkout_source') {
            steps {
                deleteDir()
                checkout scm
            }
        }
        stage('build') {
            steps {
                echo 'i build therefore i am'
            }
        }
    }
}
For the moment there are no pre-build steps, but for the purpose you are looking for this can be done in the pipeline job configuration (and also in Multibranch Pipeline jobs): where you define the location of your Jenkinsfile, choose Additional Behaviours -> Wipe out repository & force clone.
Delete the contents of the workspace before building, ensuring a fully fresh workspace.
If you do not really want to delete everything and would rather save some network usage, you can use this other option instead: Additional Behaviours -> Clean before checkout.
Clean up the workspace before every checkout by deleting all untracked files and directories, including those which are specified in .gitignore. It also resets all tracked files to their versioned state. This ensures that the workspace is in the same state as if you cloned and checked out in a brand-new empty directory, and ensures that your build is not affected by the files generated by the previous build.
This one will not delete the workspace but will just reset the repository to its original state and pull new changes if there are any.
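If you prefer to keep this in the Jenkinsfile rather than in the job configuration, the same behaviours can be expressed as git plugin extensions in an explicit checkout step. A minimal sketch, assuming a placeholder repository URL and branch:
// after skipDefaultCheckout, do an explicit checkout with the desired extension
checkout([$class: 'GitSCM',
    branches: [[name: '*/master']],                              // placeholder branch
    userRemoteConfigs: [[url: 'https://example.com/repo.git']],  // placeholder URL
    extensions: [
        [$class: 'WipeWorkspace']              // equivalent of "Wipe out repository & force clone"
        // or: [$class: 'CleanBeforeCheckout'] // equivalent of "Clean before checkout"
    ]
])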
I use "Prepare an environment for the run / Script Content"

How to manage multiple Jenkins pipelines from a single repository?

At this moment we use JJB to compile Jenkins jobs (mostly pipelines already) in order to configure about 700 jobs, but JJB2 does not seem to scale well for building pipelines, and I am looking for a way to drop it from the equation.
Mainly I would like to be able to have all these pipelines stored in a single centralized repository.
Please note that keeping the CI config (Jenkinsfile) inside each repository and branch is not possible in our use case, we need to keep all pipelines in a single "jenkins-jobs.git" repo.
As far as I know this is not possible yet, but in progress. See: https://issues.jenkins-ci.org/browse/JENKINS-43749
I think this is the purpose of Jenkins shared libraries.
I didn't develop such a library myself, but I am using some. Basically:
Develop the "shared code" of the Jenkins pipeline in a shared library
it can contain the whole pipeline (sequence of steps)
Add this library to the Jenkins server
In each project, add a Jenkinsfile that imports it using @Library (see the sketch below)
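A minimal sketch of such a Jenkinsfile, assuming the library was registered on the Jenkins server under the name my-shared-lib and exposes a global variable buildApp in its vars directory (both names are placeholders):
@Library('my-shared-lib') _   // loads the shared library configured in Jenkins

// delegate the whole pipeline to the shared step defined in vars/buildApp.groovy
buildApp()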
As @Juh_ said, you can use Jenkins shared libraries. Here are the complete steps. Suppose that we have three branches:
master
develop
stage
and we want to create a single Jenkins file so that we only have to make changes in one place. All you need is to create a new branch, e.g. common. This branch MUST have the standard shared-library structure (shown below). What we are interested in for now is adding a new groovy file in the vars directory, e.g. common.groovy. Here we can put the common Jenkins file that you wish to be used across all branches.
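A minimal layout for the common branch, following the standard shared-library convention (only the vars directory is strictly needed for this example):
(root of the common branch)
└── vars/
    └── common.groovy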
Here is a sample:
def call() {
    node {
        stage("Install Stage from common file") {
            if (env.BRANCH_NAME.equals('master')) {
                echo "npm install from common files master branch"
            }
            else if (env.BRANCH_NAME.equals('develop')) {
                echo "npm install from common files develop branch"
            }
        }
        stage("Test") {
            echo "npm test from common files"
        }
    }
}
You must wrap your code in a call function so it can be used in other branches. Now that we have finished the work in the common branch, we need to use it in our other branches. Go to any branch in which you wish to use this pipeline, e.g. master, create a Jenkinsfile, and put this one line of code in it:
common()
This will call the common function that you created before in the common branch and will execute the pipeline.
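Note that if the library is not configured to load implicitly, the Jenkinsfile would need to load it explicitly first; a sketch, assuming it was registered in Jenkins under the name common:
@Library('common') _   // only needed when the library is not loaded implicitly
common()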

Jenkins pipeline share information between jobs

We are trying to define a set of jobs on Jenkins that will do very specific actions: JobA1 will build a Maven project, while JobA2 will build .NET code, JobB will upload it to Artifactory, JobC will download it from Artifactory, and JobD will deploy it.
Every job will have a set of parameters so we can reuse the same job for any product (around 100).
The idea behind this is to create black boxes: I call a job with some input and always get some output, and whatever happens in between is something I don't need to care about. On the other hand, this allows us to improve each job separately, adding the required complexity, and all products instantly benefit from it.
We want to use Jenkins Pipeline to orchestrate the execution of actions. We are going to have a pipeline per environment/usage.
PipelineA will call JobA1, then JobB to upload to Artifactory.
PipelineB will download the package (JobC) and then deploy to staging.
PipelineC will download the package (JobC) and then deploy to production, based on some internal validations.
I have tried to get some variables from JobA1 (basic POM data such as the artifactId or version) injected into JobB, but the information doesn't seem to be transferred.
The same happens when downloading files: I call JobC, but the file stays in that job's workspace and isn't available to any other job, and I'm afraid the "External Workspace Manager" plugin adds too much complexity.
Is there any way other than sharing the workspace to achieve my purpose? I understand that sharing the workspace would make it impossible to run two pipelines at the same time.
Am I following the right path or am I doing something weird?
There are two ways to share info between jobs:
You can use stash/unstash to share the files/data between multiple jobs in a single pipeline.
stage ('HostJob') {
    build 'HostJob'
    dir('/var/lib/jenkins/jobs/Hostjob/workspace/') {
        sh 'pwd'
        stash includes: '**/build/fiblib-test', name: 'app'
    }
}
stage ('TargetJob') {
    dir("/var/lib/jenkins/jobs/TargetJob/workspace/") {
        unstash 'app'
        build 'Targetjob'
    }
}
In this manner, you can always copy the file/exe/data from one job to the other. This pipeline feature is preferable to archiving artifacts when the data only needs to be kept locally, since the stashed data is deleted after the build (which helps with data management).
You can also use the Copy Artifact Plugin.
There are two things to consider for copying an artifact:
a) Archive the artifacts in the host project and assign permissions.
b) In the host project's configuration, enable 'Permission to copy artifact' → Projects to allow copy artifacts: *
c) Create a Post-build Action → Archive the artifacts → Files to archive: "select your files"
d) Copy the artifacts required from host to target project.
In the target project, create a Build step → Copy artifacts from another project → enter the project name of the host project, which build (e.g. 'Latest successful build'), Artifacts to copy ('$host project folder'), and Target directory ('$local folder location').
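If the target job is itself a pipeline, the same copy can be done with the copyArtifacts step provided by the Copy Artifact plugin. A minimal sketch, with placeholder project and file names:
// copies artifacts archived by the host project into this job's workspace
copyArtifacts(
    projectName: 'host-project',   // placeholder: name of the job that archived the artifacts
    selector: lastSuccessful(),    // which build to copy from
    filter: 'target/*.jar',        // placeholder pattern of artifacts to copy
    target: 'incoming/'            // placeholder directory inside the current workspace
)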
For the first part of your question (passing variables between jobs), you can use the following build step in a post section:
post {
    always {
        build job: '/Folder/JobB', parameters: [string(name: 'BRANCH', value: "${params.BRANCH}")], propagate: false
    }
}
The above post-build action runs for all build results. Similarly, the post-build action could be conditioned on the current build status. I have used the BRANCH parameter from the current build (JobA) as a parameter to be consumed by 'JobB' (provide the exact location of the job). Please note that a matching parameter must be defined in JobB, as sketched below.
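For completeness, the matching declaration in JobB's declarative pipeline could look like this (the default value is a placeholder); it goes inside JobB's pipeline block:
parameters {
    // JobB must declare the parameter it receives from JobA
    string(name: 'BRANCH', defaultValue: 'master', description: 'Branch passed in by the upstream job')
}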
Moreover, for sharing the workspace, you can refer to this link and share the workspace between the jobs.
You could use the Pipeline: Shared Groovy Libraries plugin. Have a look at its documentation to implement libraries that multiple pipelines share and to define shared global variables.
