Is the Jenkins workspace on the master or the worker? - jenkins

Who does the actual cloning of the project, is it the master or the agent node? If it is the master, then how does the agent node actually execute the job. If it is the agent node, how can we view the workspace in the browser?
When people ask "where is the workspace" the answer is usually a path, but I am more interested in where that path is, on the master or the agent node? Or maybe it is both?
Edit1
Aligned terminology to this: https://jenkins.io/doc/book/glossary/ in order to avoid confusion.
In a Jenkins set up all the machines are considered nodes. The master node connects to one or more agent nodes. Executors can run both on the master or agent nodes.
In my scenario, no executors run on the master. They are run only on the agent nodes.

The answer is: it depends !
First of all, although it is not a good practice IMO, some installation let the master be an actual worker and run jobs. In this case, the workspace will be on the master.
If you configured the master not to accept jobs, there are still occasion when a workspace can be created on the master. A good example is when your job is a "pipeline script from SCM". In this case, the master will create a workspace for the job, clone the target repo, read the pipeline, and start needed jobs on whatever slave is targeted, creating a workspace to run the actions themselves. If the pipeline targets multiple slaves, there will be a workspace on each of them.
In simple situation (e.g. maven or freestyle job), the workspace will only be on the targeted slave.

I needed to dig a bit deeper to understand this.
I ran a brand new instance of Jenkins and I attached a single agent node. I used SSH and I set the remote (agent) root directory to: /home/igorski/jenkins
As soon as I attached the node the remoting folder and remoting.jar showed up in that root directory.
I ran a basic Gradle Java pipeline job (Jenkinsfile in the project).
The workspace showed up on the slave. Not on the master.
From the Jenkins GUI I can access the workspace and see it's contents.
At the moment I kill the agent machine I can no longer view the workspace in Jenkins.
My guess is that the remoting.jar somehow does a live sync.
I also ran a freestyle project and I can confirm the same. As soon as the agent is killed I can no longer open the Workspace and I get an error stack trace:
hudson.remoting.Channel$CallSiteStackTrace: Remote call to JenkoOne
This was much more obvious with the Pipeline job though. There you get a link to the agent that you need to click in order to see the contents. As soon as the agent is gone the link is disabled. And you know exactly on which agent the node is. With freestyle jobs, you just get a Workspace link. There is no indication on what agent it is or if the agent is accessible at the moment.
So, both Zeitounator and fabian were correct.

Related

Why declarative pipelines need to run on master if there are build executors available?

I'm using recent Jenkins version 2.286 and since this update there is an security hint: "You should set up distributed builds. Building on the controller node can be a security issue. See the documentation."
But I'm already doing so with three Jenkins nodes and I also fully understand the security implications.
The problem here is, that there are two jobs that need to run an the master, since they are the jobs that deploy those Jenkins nodes. That means I can not reduce the build executors to 0.
I've also tried using the Job Restrictions plugin to restrict which jobs can run on the master. This problem here is that all my jobs are waiting for the master queue do have a free slot available. I wonder why, because they all are declarative pipelines and define something like:
agent {
label 'some-different-node-label'
}
Which means they aren't really executed on the master node.
Questions here are:
Is this intentionally that all jobs require the master node before switching the agent?
Is there any configuration option to change that?
Is there a way to execute the deploy jobs on master, even if there aren't any executed defined (to bypass that behavior)?
Thanks.
With declarative pipelines the lightweight code checkout is done on the Master node to get a Jenkinsfile for that job. While this doesnt use an executor on the Master perhaps the Job Restriction Plugin is still blocking this (I havent used it before so cannot comment)
Also certain pipeline actions are delegated back to the Master node as well (e.g. the withAWSParameterStore step.
If you look at the console output for a Declarative pipeline job, you will see lots of output (mainly around library checkouts or git checkouts) before you see the start of the pipeline [Pipeline] Start of Pipeline. All that is done on the Master.
Unfortunately this cannot be changed as the Master needs to do this work to find out which agent type to delegate the job to.
Depending on how you are running you agents, you could use something like the EC2 Cloud Plugin to generate you agent nodes which wouldn't require a job to do it

When do the jenkins workspaces get preserved?

I have a bunch of pipeline jobs, yet when executed, workspaces of some get preserved, some are deleted. How does jenkins make these decisions?
Based on my findings so far:
All jobs executed on nodes will have their workspace persisted, e.g. /home/ec2-user/workspaces/some-job
Some works on master keep their workspaces but some others' workspaces disappear after the job has finished. For example, after my build job succeeded, if I ssh in I can see the its workspace directory; but all my e2e jobs have no workspace.
Note I didn't use any of clearWs, deleteDir etc in my pipelines.
By the way, the reason I'm looking into workspaces is the disk usage keeps increasing and I want to cleanup. I thought the workspace is overwritten each time a job runs, but yet I get the 'Disk space is too low' warning several times.
Jenkins is creating a new workspace for every build job (= run) per default. You can see that in the path of the ws in your console log: /here/is/the/ws#buildnumber. If you dont want to have that behavior you can set it to an path which is for instance for every repo the same: How to set specific workspace folder for jenkins multibranch pipeline projects
Maybe some of your jobs don't get executed on the Jenkins Master, but on some connected Node (via an agent directive within your Jenkinsfile or Pipeline description). If that's the case you won't see a build directory inside the workspace for this Job on the Jenkins master, but on the connected Node.
You would only get the build results (like artifacts, reports, etc.) under /<JENKINS_HOME>/jobs/My_Job/ on the Master.
Remember that you could trigger a Jenkins build on a node also indirectly if you, for example, run the build within a Dockerfile and have configured (within Jenkins configuration) a specific node label for execution of Docker builds.

How to checkout and run pipeline file from TFS on specific node in Jenkins?

I am trying to run a pipeline job that get its' pipeline file from TFS but the mapping of the workspace and the checkout is done on the Master instead of the Slave.
I have Jenkins-master which is installed on a linux machine and I connected a windows machine as a slave to it. I created a pipeline job with 'Pipeline script from SCM' option selected for TFS.
How can I make the windows slave run that pipeline job?
The master can't run that job because it is running on linux and it fails when it is trying to map a workspace to TFS in order to download the pipeline script and run it.
Even if I create another pipeline job and select to hard-code a script to run my original pipeline job like this:
node('WIN_SLAVE') {
build job: 'My_Pipeline'
}
It doesn't work.
And I can see in the output that the initiali script (above) is in fact running on my windows slave, but when it's building the job 'My_Pipeline' it still tries to map a workspace to the Jenkins-master at it's linux machine path /var/jenkins/... and it fails.
If the initial pipeline script ran at the windows slave, why does the other pipeline script not running on the same node? Why is it trying again to checkout the pipeline file from TFS to the Jenkins-Master?
How can I make the windows slave checkout the pipeline file and run it?
Here are some things to check...
Make sure you disabled the original job, or you are completely redefining it for running on the slave, because you indicated you set up “another job” for the slave. It appears that this other job is just triggering the previous job, rather than defining its own specifications. When the job is ran on the slave, it’s just running whatever settings are in that original job.
Also, If you have the box checked to build when a change is pushed to TFS, then your original job could still be trying to run every time a change is made to TFS.
Verify the slaves Remote root directory is set properly in the slave configuration under Manage Jenkins -> Manage Nodes.
Since this slave job is triggering the other job you originally created on the master, then it will build on the master as expected.
Instead of referencing the My_Pipeline job, change the My_Pipeline job itself to run on the slave. If you are using a declarative Pipeline for the original job, then change that original job to run on the slave within the original job settings. You can do it similarly to how you have indicated above, just define the node in the original job.
If the original job is a freestyle project, there is a checkbox titled Restrict where this project can be run. Check that and include the name of the slave in the Label Expression. When you run the job, it will then be restricted to the slave.
Lastly, posting the My_Pipeline job will be helpful.

can we access the jenkins master home directory from the slave machine

I want to execute some operation on Jenkins slave machine but want to use the data present in jenkins master home directory, can I access the jenkins master home directory from the slave machine?
Please let me know any leads here!
You can not access them directly but you can copy the folder or files from master to slave & vice versa.
Plugin link
First of all, I think that one of the Jenkins/Cloudbees recommendation is to avoid builds on master.
Secondly, if you have n slaves with the same label (let's say BUILD) you just need to configure your job with this label so that it will run on any of the slave (jenkins will choose the less loaded) having this label, and the checkout will be made on that slave.

Inconsistent Jenkins workspace path on slave machines

We have some jobs set up which share a workspace. The workflow for the various branches is:
Build a big honking C++ project called foo.
Execute several downstream tests, each of which uses the workspace of foo.
We accomplish this by assigning the Use custom workspace field of the downstream jobs to the build workspace.
Recently, we took one branch and assigned it to be build on a Jenkins slave machine rather than on the master. I was surprised to find that on master, the foo repository was cloned to $JENKINS_JOBS_PATH/FOO/workspace/foo_repo - while on the slave, the repository was cloned to $JENKINS_JOBS_PATH/FOO/foo_repo.
Is this by design, or have we somehow configured master and slave inconsistently?
Older versions of Jenkins put the workspace under the ${JENKINS_HOME}/jobs/JOB/workspace directories. After upgrading, this pattern stays with the Jenkins instance. New versions put the workspaces in ${JENKINS_HOME}/workspace/. I suspect the slaves don't need to follow the old pattern (especially if it is a newer slave), so the directories may not be consistent across machines.
You can change the location of the workspaces on the master in Jenkins -> Configure Jenkins -> Advanced.
I think the safe way to handle this... If you are going to use a custom workspace, you should use that for all of your jobs, including the first one that builds the big honking c++ project.
If you did this all in a pipeline, you can run all of this in a single job and have more control over where all the files are, and you have the option of stash and unstash, but if the files are huge, stash may not be the way to go.
You can omit 'Use custom workspace' option for each job and instead change master and/or slave workspace paths and use
%WORKSPACE%/../foo_repo path
or (that equal)
./../foo_repo path
In that case
%WORKSPACE% = [master or slave node workspace]/[job name]
and
%WORKSPACE%/../ = [master or slave node workspace]

Resources