When do the jenkins workspaces get preserved? - jenkins

I have a bunch of pipeline jobs, yet when they are executed, the workspaces of some are preserved and those of others are deleted. How does Jenkins make these decisions?
Based on my findings so far:
All jobs executed on nodes will have their workspace persisted, e.g. /home/ec2-user/workspaces/some-job
Some jobs on master keep their workspaces, but other jobs' workspaces disappear after the job has finished. For example, after my build job succeeded, if I ssh in I can see its workspace directory; but all my e2e jobs have no workspace.
Note I didn't use any of cleanWs, deleteDir etc. in my pipelines.
By the way, the reason I'm looking into workspaces is that disk usage keeps increasing and I want to clean up. I thought the workspace was overwritten each time a job runs, yet I have received the 'Disk space is too low' warning several times.

Jenkins creates a new workspace for every build (= run) by default. You can see that in the workspace path in your console log: /here/is/the/ws#buildnumber. If you don't want that behavior, you can set the workspace to a fixed path, for instance the same one for every repo: How to set specific workspace folder for jenkins multibranch pipeline projects
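As a rough illustration of setting a fixed workspace path, a declarative Pipeline can request one through customWorkspace. This is only a sketch: the label 'linux' and the directory below are hypothetical placeholders, not values from the question.

```groovy
// Declarative Pipeline sketch: pin the job to one fixed workspace directory.
// The label 'linux' and the path are hypothetical placeholders.
pipeline {
    agent {
        node {
            label 'linux'
            customWorkspace '/home/ec2-user/workspaces/shared-checkout'
        }
    }
    stages {
        stage('Build') {
            steps {
                // env.WORKSPACE holds the directory the build is running in
                echo "Building in ${env.WORKSPACE}"
            }
        }
    }
}
```

Because every run then reuses the same directory, leftovers from earlier builds stay in place unless the job removes them.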

Maybe some of your jobs don't get executed on the Jenkins Master, but on some connected Node (via an agent directive within your Jenkinsfile or Pipeline description). If that's the case you won't see a build directory inside the workspace for this Job on the Jenkins master, but on the connected Node.
You would only get the build results (like artifacts, reports, etc.) under /<JENKINS_HOME>/jobs/My_Job/ on the Master.
Remember that you could trigger a Jenkins build on a node also indirectly if you, for example, run the build within a Dockerfile and have configured (within Jenkins configuration) a specific node label for execution of Docker builds.
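For the disk usage concern in the question, one option is to clean up explicitly rather than rely on default behavior. The sketch below assumes a declarative Pipeline; the number of builds to keep is an arbitrary example value. deleteDir() comes with the basic Pipeline steps, and cleanWs() from the Workspace Cleanup plugin could be used instead if that plugin is installed.

```groovy
// Sketch: limit retained build records and wipe the workspace after each run.
pipeline {
    agent any
    options {
        // Keep only the 10 most recent build records (10 is an example value)
        buildDiscarder(logRotator(numToKeepStr: '10'))
    }
    stages {
        stage('Build') {
            steps {
                echo "Workspace: ${env.WORKSPACE}"
            }
        }
    }
    post {
        always {
            // deleteDir() removes the current directory, which here is the workspace
            deleteDir()
        }
    }
}
```

Note that buildDiscarder affects the build records kept under JENKINS_HOME, while deleteDir() affects the workspace on whichever node ran the build.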

Related

Is the Jenkins workspace on the master or the worker?

Who does the actual cloning of the project, is it the master or the agent node? If it is the master, then how does the agent node actually execute the job? If it is the agent node, how can we view the workspace in the browser?
When people ask "where is the workspace" the answer is usually a path, but I am more interested in where that path is, on the master or the agent node? Or maybe it is both?
Edit1
Aligned terminology to this: https://jenkins.io/doc/book/glossary/ in order to avoid confusion.
In a Jenkins setup all the machines are considered nodes. The master node connects to one or more agent nodes. Executors can run on both the master and the agent nodes.
In my scenario, no executors run on the master. They are run only on the agent nodes.
The answer is: it depends!
First of all, although it is not a good practice IMO, some installations let the master be an actual worker and run jobs. In this case, the workspace will be on the master.
If you configured the master not to accept jobs, there are still occasions when a workspace can be created on the master. A good example is when your job is a "pipeline script from SCM". In this case, the master will create a workspace for the job, clone the target repo, read the pipeline, and start the needed jobs on whatever slave is targeted, creating a workspace there to run the actions themselves. If the pipeline targets multiple slaves, there will be a workspace on each of them.
In simple situations (e.g. a maven or freestyle job), the workspace will only be on the targeted slave.
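The multi-slave case can be sketched with a declarative Pipeline that allocates no global agent and lets each stage pick its own. The labels 'linux-agent' and 'windows-agent' are hypothetical; each stage prints the node and workspace path it was given, so you can see one workspace per targeted node.

```groovy
// Sketch: each stage runs on its own agent and gets its own workspace there.
// Labels are hypothetical placeholders.
pipeline {
    agent none
    stages {
        stage('Build') {
            agent { label 'linux-agent' }
            steps {
                echo "Build runs on ${env.NODE_NAME} in ${env.WORKSPACE}"
            }
        }
        stage('Test') {
            agent { label 'windows-agent' }
            steps {
                echo "Test runs on ${env.NODE_NAME} in ${env.WORKSPACE}"
            }
        }
    }
}
```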
I needed to dig a bit deeper to understand this.
I ran a brand new instance of Jenkins and I attached a single agent node. I used SSH and I set the remote (agent) root directory to: /home/igorski/jenkins
As soon as I attached the node the remoting folder and remoting.jar showed up in that root directory.
I ran a basic Gradle Java pipeline job (Jenkinsfile in the project).
The workspace showed up on the slave. Not on the master.
From the Jenkins GUI I can access the workspace and see its contents.
At the moment I kill the agent machine I can no longer view the workspace in Jenkins.
My guess is that the remoting.jar somehow does a live sync.
I also ran a freestyle project and I can confirm the same. As soon as the agent is killed I can no longer open the Workspace and I get an error stack trace:
hudson.remoting.Channel$CallSiteStackTrace: Remote call to JenkoOne
This was much more obvious with the Pipeline job though. There you get a link to the agent that you need to click in order to see the contents. As soon as the agent is gone the link is disabled. And you know exactly on which agent the workspace is. With freestyle jobs, you just get a Workspace link. There is no indication of which agent it is on or whether the agent is accessible at the moment.
So, both Zeitounator and fabian were correct.

How to checkout and run pipeline file from TFS on specific node in Jenkins?

I am trying to run a pipeline job that gets its pipeline file from TFS, but the mapping of the workspace and the checkout is done on the Master instead of the Slave.
I have Jenkins-master which is installed on a linux machine and I connected a windows machine as a slave to it. I created a pipeline job with 'Pipeline script from SCM' option selected for TFS.
How can I make the windows slave run that pipeline job?
The master can't run that job because it is running on linux and it fails when it is trying to map a workspace to TFS in order to download the pipeline script and run it.
Even if I create another pipeline job and select to hard-code a script to run my original pipeline job like this:
node('WIN_SLAVE') {
    build job: 'My_Pipeline'
}
It doesn't work.
And I can see in the output that the initial script (above) is in fact running on my windows slave, but when it's building the job 'My_Pipeline' it still tries to map a workspace on the Jenkins-master at its linux machine path /var/jenkins/... and it fails.
If the initial pipeline script ran on the windows slave, why is the other pipeline script not running on the same node? Why is it trying again to checkout the pipeline file from TFS to the Jenkins-Master?
How can I make the windows slave checkout the pipeline file and run it?
Here are some things to check...
Make sure you disabled the original job, or that you are completely redefining it to run on the slave, because you indicated you set up “another job” for the slave. It appears that this other job is just triggering the previous job, rather than defining its own specifications. When the job is run on the slave, it’s just running whatever settings are in that original job.
Also, if you have the box checked to build when a change is pushed to TFS, then your original job could still be trying to run every time a change is made to TFS.
Verify the slave's Remote root directory is set properly in the slave configuration under Manage Jenkins -> Manage Nodes.
Since this slave job is triggering the other job you originally created on the master, it will build on the master as expected.
Instead of referencing the My_Pipeline job, change the My_Pipeline job itself to run on the slave. If you are using a declarative Pipeline for the original job, then change that original job to run on the slave within the original job settings. You can do it similarly to how you have indicated above, just define the node in the original job.
If the original job is a freestyle project, there is a checkbox titled Restrict where this project can be run. Check that and include the name of the slave in the Label Expression. When you run the job, it will then be restricted to the slave.
Lastly, posting the My_Pipeline job will be helpful.
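The suggestion to define the node inside My_Pipeline itself could look roughly like the scripted sketch below. The stage names and the echo command are made up for illustration; 'WIN_SLAVE' is the label from the question. This only controls where the steps run. It does not by itself change where Jenkins fetches the pipeline file from, which the question reports happening on the master.

```groovy
// Sketch of the pipeline file for My_Pipeline (scripted syntax, as in the question).
// Stage names and the build command are illustrative placeholders.
node('WIN_SLAVE') {
    stage('Checkout') {
        // 'checkout scm' re-uses the SCM configured for "Pipeline script from SCM"
        checkout scm
    }
    stage('Build') {
        // Runs on the Windows slave; WORKSPACE is set by Jenkins for this node
        bat 'echo Building on %COMPUTERNAME% in %WORKSPACE%'
    }
}
```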

Inconsistent Jenkins workspace path on slave machines

We have some jobs set up which share a workspace. The workflow for the various branches is:
Build a big honking C++ project called foo.
Execute several downstream tests, each of which uses the workspace of foo.
We accomplish this by assigning the Use custom workspace field of the downstream jobs to the build workspace.
Recently, we took one branch and assigned it to be built on a Jenkins slave machine rather than on the master. I was surprised to find that on master, the foo repository was cloned to $JENKINS_JOBS_PATH/FOO/workspace/foo_repo - while on the slave, the repository was cloned to $JENKINS_JOBS_PATH/FOO/foo_repo.
Is this by design, or have we somehow configured master and slave inconsistently?
Older versions of Jenkins put the workspace under the ${JENKINS_HOME}/jobs/JOB/workspace directories. After upgrading, this pattern stays with the Jenkins instance. New versions put the workspaces in ${JENKINS_HOME}/workspace/. I suspect the slaves don't need to follow the old pattern (especially if it is a newer slave), so the directories may not be consistent across machines.
You can change the location of the workspaces on the master in Jenkins -> Configure Jenkins -> Advanced.
I think the safe way to handle this is: if you are going to use a custom workspace, use it for all of your jobs, including the first one that builds the big honking C++ project.
If you did this all in a pipeline, you can run all of this in a single job and have more control over where all the files are, and you have the option of stash and unstash, but if the files are huge, stash may not be the way to go.
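A minimal sketch of the stash and unstash idea, assuming a scripted Pipeline. The labels 'build-agent' and 'test-agent', the stash name, and the out/foo.txt file standing in for real build output are all hypothetical.

```groovy
// Sketch: pass build output between nodes with stash/unstash.
// Labels, stash name and file paths are hypothetical placeholders.
node('build-agent') {
    stage('Build foo') {
        // Stand-in for the real build: write one file as "build output"
        writeFile file: 'out/foo.txt', text: 'built'
        // Save everything under out/ so another node can retrieve it
        stash name: 'foo-build', includes: 'out/**'
    }
}
node('test-agent') {
    stage('Test') {
        // Restores out/foo.txt into this node's workspace
        unstash 'foo-build'
        echo readFile('out/foo.txt')
    }
}
```

Stashed files are copied through the master, which is why this suits small outputs better than a large build tree.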
You can omit 'Use custom workspace' option for each job and instead change master and/or slave workspace paths and use
%WORKSPACE%/../foo_repo path
or, equivalently,
./../foo_repo path
In that case
%WORKSPACE% = [master or slave node workspace]/[job name]
and
%WORKSPACE%/../ = [master or slave node workspace]
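The answer above is written for freestyle jobs on Windows. For comparison only, the same relative-path idea in a scripted Pipeline might be sketched like this, with foo_repo taken from the answer and everything else assumed:

```groovy
// Sketch: work in a sibling directory of this job's workspace.
node {
    // env.WORKSPACE is [node workspace root]/[job name], so ../foo_repo is a sibling
    dir("${env.WORKSPACE}/../foo_repo") {
        echo "Now in ${pwd()}"
    }
}
```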

Jenkins Multibranch Pipeline - Issues with deleting jobs

Use case: Using Jenkinsfile to auto create builds for branches
Summary:
For a variety of reasons sometimes the Jenkins master fails to connect to the SCM server. When this occurs Jenkins deletes that job directory on master, because it no longer sees the branches. However, the slaves are not cleaned up and so they still have the old workspace paths (which are uniquely named based on the build # in my setup). When the Jenkins master reconnects to the SCM server, it recreates a new job folder on master, and the build counter is reset to #1.
This creates the following issues:
When a build starts, it executes on a slave. Since master has a new counter the job is #1. But this path may already exist from a previous build on that slave, so the artifact is built with content that was checked out for the original old build (i.e. maven uses the /target directory inside the workspace which already existed from previous build). So the end result is an artifact that potentially has the wrong code.
This can create build storms. After the connection issues are resolved, Jenkins will see all the repositories and branches with Jenkinsfiles and start to build them. So in a setup of let's say 20 repositories with 10 branches each, this will create 200 new builds. This increases with additional repositories and branches. This is obviously not desired.
Solutions:
One quick solution I can think of is to update the Jenkinsfile to delete the workspace if it exists before running the job inside of it. But this is just a work around. I would not want to mask the connection issues and would like to retain the actual build history of a pipeline (not have it keep erasing itself).
Minimize connection issues. This obviously will not always be guaranteed though. Plus sometimes maintenance must force servers offline. While I can construct maintenance in a way to limit or work around such issues, there still will be rare cases where downtime is required across the board. It would be best if Jenkins could handle this use case.
I'm curious if anyone has run into this issue and what the thoughts are on this problem?
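The workaround from the first solution (wipe the workspace before building) could be sketched in a scripted Jenkinsfile as follows. It assumes the job is defined from SCM so that 'checkout scm' is available, and as noted above it only hides the stale-workspace symptom.

```groovy
// Sketch: start every build from an empty workspace.
node {
    stage('Clean') {
        // Remove anything left over from an earlier build that used the same path
        deleteDir()
    }
    stage('Checkout') {
        // Fresh checkout of the branch that triggered this build
        checkout scm
    }
}
```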

Jenkins. Copy a file from multiconfiguration child job's workspace to parent job's workspace

I have a multiconfiguration job myjob. It has a BUILD_TYPES axis with possible values release and debug. myjob is not running on the Jenkins master, but on one of the slaves. Since it is a multiconfig job, it spawns two more builds on slaves (for a total of three slaves working on one myjob).
myjob also triggers a test. The test is triggered with Parameters from file option. This file is written in each configuration (debug and release), and I cannot find a way to copy it from the workspace of the child jobs to the parent job's workspace. I tried Copy to Slave, Copy Artifacts from another project, and some other things to no avail. What is the general approach to copy files from children to parent (all running on different slaves)?

Resources