I have a job in a pipeline that cleans up Docker images. It has one stage per worker, each pinned to that worker by name. This is frustrating because when I add jenkins-cpu-worker3, I'll have to update the job.
I'd like the job to run on all workers without having to update it each time a new worker is added, and regardless of what each worker is named.
Is there a way to query Jenkins from within the pipeline to get a list or array of all the workers that exist? I have been leafing through documentation and posts online and have not found a solution that works. If possible, I'd like to do this without any additional Jenkins plugins.
pipeline {
    agent any
    stages {
        stage('Cleanup jenkins-cpu-worker1') {
            agent {
                node {
                    label 'jenkins-cpu-worker1'
                }
            }
            steps {
                sh "docker container prune -f"
                sh "docker image prune -f"
                sh '''docker images | awk '{print $1 ":" $2}' | xargs docker image rm || true'''
                sh "docker network prune -f"
                sh "docker volume prune -f"
            }
        }
        stage('Cleanup jenkins-cpu-worker2') {
            agent {
                node {
                    label 'jenkins-cpu-worker2'
                }
            }
            steps {
                sh "docker container prune -f"
                sh "docker image prune -f"
                sh '''docker images | awk '{print $1 ":" $2}' | xargs docker image rm || true'''
                sh "docker network prune -f"
                sh "docker volume prune -f"
            }
        }
    }
}
Here is an improved version of your Pipeline. This will dynamically get all the active agents, and run your cleanup task in parallel.
pipeline {
    agent any
    stages {
        stage('CleanupWorkers') {
            steps {
                script {
                    echo "Something"
                    parallel parallelJobs()
                }
            }
        }
    }
}

// Build a map of branch name -> closure, one entry per online agent.
def parallelJobs() {
    def jobs = [:]
    for (def nodeName in getAllNodes()) {
        jobs[nodeName] = getStage(nodeName)
    }
    return jobs
}

// Return a closure that runs the cleanup on the named agent.
def getStage(def nodeName) {
    return {
        stage("Cleaning $nodeName") {
            node(nodeName) {
                sh '''
                    echo "Starting cleaning"
                    docker container prune -f
                    docker image prune -f
                    docker images | awk '{print $1 ":" $2}' | xargs docker image rm || true
                    docker network prune -f
                    docker volume prune -f
                '''
            }
        }
    }
}

// Uses Jenkins internals, so a sandboxed Pipeline needs an administrator
// to approve these calls. @NonCPS keeps the non-serializable Node objects
// out of the Pipeline's saved state.
@NonCPS
def getAllNodes() {
    def nodeNames = []
    Jenkins.instance.getNodes().each { node ->
        // Ignore offline agents
        def computer = node.toComputer()
        if (computer != null && computer.isOnline()) {
            nodeNames.add(node.getNodeName())
        }
    }
    return nodeNames
}
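One detail of the cleanup commands is worth a closer look: the `awk '{print $1 ":" $2}'` filter also emits the header row (`REPOSITORY:TAG`) and any `<none>:<none>` entries, and `docker image rm` then fails on them (the `|| true` hides the error). A minimal sketch of a stricter filter follows. The function name `list_tagged_images` and the sample rows are made up for illustration, and the sample text stands in for real `docker images` output so the sketch runs without Docker.

```shell
#!/bin/sh
# Read `docker images` table output on stdin and print repo:tag pairs,
# skipping the header row and untagged (<none>) entries.
# list_tagged_images is a hypothetical helper name, not part of Docker.
list_tagged_images() {
    awk 'NR > 1 && $1 != "<none>" && $2 != "<none>" { print $1 ":" $2 }'
}

# Made-up sample rows standing in for real `docker images` output.
sample_output='REPOSITORY   TAG       IMAGE ID       CREATED        SIZE
alpine       latest    0123456789ab   2 weeks ago    7MB
<none>       <none>    ba9876543210   3 weeks ago    120MB
node         18        abcdef012345   4 weeks ago    1GB'

tagged=$(printf '%s\n' "$sample_output" | list_tagged_images)
printf '%s\n' "$tagged"
```

On an agent, the same filter would presumably be wired in as `docker images | list_tagged_images | xargs docker image rm || true`, in place of the plain `awk` call.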
If I'm running parallel tasks across multiple agents, instead of having a post action on each job, how can I have it on the parallel stage itself? The issue with having them all on each stage is that if the stage is sharing an agent with another stage, docker system prune -f -a can interfere with the other stage.
I am trying to ensure the cache is clear on the build agent, so that it always pulls the latest image fresh from Artifactory, and to clean up after the build so that nothing is left dangling.
For example
stage("Docker stuff"){
parallel{
stage(foo){
agent{label "docker"}
steps{...}
}
stage(bar){
agent{label "docker"}
steps{...}
}
stage(foobar){
agent{label "docker"}
steps{...}
}
}
post{
always{
sh "docker image prune -f"
sh "docker volume prune -f"
sh "docker container prune -f"
sh "docker system prune -a -f"
sh "docker builder prune -a -f"
}
}
}
The issue with the above is that the post actions run only on the agent selected to manage the parallel stages, not the agents actually doing the running.
The issue with the below is the post actions can interfere with other stages.
stage("Docker stuff"){
parallel{
stage(foo){
agent{label "docker"}
steps{
script{
def dockerImage = _dockerImage("docker-local/foo-image")
docker.withRegistry( "https://artifactory-dev.company.com/", "svc_bar") {
dockerImage.push()
dockerImage.push("latest")
}
}
}
post{
always{
echo "========always========"
sh "docker image prune -f"
sh "docker volume prune -f"
sh "docker container prune -f"
sh "docker system prune -a -f"
sh "docker builder prune -a -f"
}
}
}
stage(bar){
agent{label "docker"}
steps{...}
post{...}
}
stage(foobar){
agent{label "docker"}
steps{...}
post{...}
}
}
}
stage(''){
steps {
script {
timeout(time: 24, unit: 'HOURS') {
try {
parallel(
a: {
},
b: {
},
c: {
}
)
}catch (any){
//currentBuild.result = 'FAILURE'
}finally{
}
}
}
}
}
post {
always {
}
}
It seems to me that certain Docker actions cannot be performed at the same time on one node; across different nodes I do not see a problem.
I want to delete the image from the previous build. I'm able to get its image ID, but the job dies every time it hits the docker rmi command.
stage('Clean old Image') {
steps {
script {
def imageName = "${registry}" + "/" + "${branchName}"
env.imageName = "${imageName}"
def oldImageID = sh(
script: 'docker images -qf reference=\${imageName}:\${imageTag}',
returnStdout: true
)
echo "Image Name: " + "${imageName}"
echo "Old Image: ${oldImageID}"
if ( "${oldImageID}" != '' ) {
echo "Deleting image id: ${oldImageID}..."
sh 'docker rmi -f $oldImageID'
} else {
echo "No image to delete..."
}
}
}
}
The stage log in the console shows these error messages:
Shell Script -- docker rmi -f $oldImageID (self time 282ms)
+ docker rmi -f
"docker rmi" requires at least 1 argument.
See 'docker rmi --help'.
Usage: docker rmi [OPTIONS] IMAGE [IMAGE...]
Remove one or more images
But the image ID has actually been captured, as the stage log shows:
Print Message -- Old Image: 267848fadb74 (self time 11ms)
Old Image: 267848fadb74
Try using double quotes (") instead of single quotes (') so that Groovy interpolates ${oldImageID} before the command reaches the shell:
sh "docker rmi -f ${oldImageID}"
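Why the single-quoted version ends up as a bare `docker rmi -f` can be reproduced in plain shell. Inside single quotes Groovy leaves `$oldImageID` untouched, so the shell receives the literal text and expands a variable that does not exist in its environment. The sketch below assumes a hypothetical `show_cmd` function standing in for `docker rmi`, so it runs without Docker; the image ID is the one from the log above.

```shell
#!/bin/sh
# Hypothetical stand-in for `docker rmi`: prints the arguments it would receive.
show_cmd() {
    echo "docker rmi -f $*"
}

# Case 1: what the shell sees from  sh 'docker rmi -f $oldImageID'.
# oldImageID is a Groovy variable, so the shell has no variable of that name
# and the unquoted expansion contributes no argument at all.
unset oldImageID
single_quoted_result=$(show_cmd $oldImageID)

# Case 2: what the shell sees from  sh "docker rmi -f ${oldImageID}".
# Groovy has already substituted the value before the shell starts.
double_quoted_result=$(show_cmd 267848fadb74)

echo "single: [$single_quoted_result]"
echo "double: [$double_quoted_result]"
```

The first line printed shows the command with no image argument, which matches the `+ docker rmi -f` line in the failing stage log.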
I've inherited this Jenkinsfile stage that will run a new docker image using withRun:
stage('Deploy') {
steps {
script {
docker.image('deployscript:latest').withRun("""\
-e 'IMAGE=${IMAGE_NAME}:${BUILD_ID}' \
-e 'CNAME=${IMAGE_NAME}' \
-e 'PORT=${PORT_1}:80' \
-e 'PORT=${PORT_2}:443'""") { c ->
sh "docker logs ${c.id}"
}
}
}
}
However, I believe this method is only meant for testing purposes and actually stops the container once the block is finished. I want this step to actually run the container and stop/restart the previous one if necessary. The documentation out there on this is surprisingly sparse. Please help.
If you want to run the Docker container throughout all the stages, then the example would look like the following:
Scripted Pipeline
node('master') {
/* Requires the Docker Pipeline plugin to be installed */
docker.image('alpine:latest').inside {
stage('01') {
sh 'echo STAGE01'
}
stage('02') {
sh 'echo STAGE02'
}
}
}
Declarative Pipeline
pipeline {
agent {
docker {
image 'alpine:latest'
label 'master'
args '-v /tmp:/tmp'
}
}
stages {
stage('01') {
steps {
sh "echo STAGE01"
}
}
stage('02') {
steps {
sh "echo STAGE02"
}
}
}
}
In both scripted and declarative pipelines, the Docker container from the alpine image stays active until all the stages finish, and it is removed only once they have completed, whether in success or failure.
But if you want to control starting, stopping, and restarting the container yourself in different stages, you can do it with shell commands or by writing a small Groovy script that wraps the docker command, like below:
node {
    stage('init') {
        // Create (but do not start) a named container with the workspace mounted
        sh 'docker create --name myImage1 -v $(pwd):/var/jenkins -w /var/jenkins imageName:tag'
    }
    stage('build') {
        // make use of docker command to start, stop and execute some script inside the container
        // same goes for other stage
        // once all done you can remove the container
        sh 'docker rm myImage1'
    }
}
The following will stop the existing container and run a new one with the new image:
stage('Deploy') {
steps {
sh "docker stop ${IMAGE_NAME} || true && docker rm ${IMAGE_NAME} || true"
sh "docker run -d \
--name ${IMAGE_NAME} \
--publish ${PORT}:443 \
${IMAGE_NAME}:${BUILD_ID}"
}
}
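The same stop, remove, and run sequence can also be kept in a small shell helper that the `sh` step calls. This is only a sketch under stated assumptions: the function name `redeploy` and the `DOCKER` override variable are hypothetical, introduced here so the command can be dry-run with `DOCKER=echo`. The port mapping to 443 follows the stage above.

```shell
#!/bin/sh
# DOCKER is a hypothetical override (e.g. DOCKER=echo for a dry run);
# it defaults to the real docker CLI.
DOCKER=${DOCKER:-docker}

# redeploy NAME IMAGE HOST_PORT
# Replace the container called NAME with a fresh one started from IMAGE.
redeploy() {
    name=$1
    image=$2
    host_port=$3
    # Stop and remove the previous container; ignore errors if it does not exist.
    "$DOCKER" stop "$name" >/dev/null 2>&1 || true
    "$DOCKER" rm "$name" >/dev/null 2>&1 || true
    # Start the new container detached, publishing HOST_PORT to port 443.
    "$DOCKER" run -d --name "$name" --publish "${host_port}:443" "$image"
}

# Example call, mirroring the variables used in the stage above:
#   redeploy "$IMAGE_NAME" "$IMAGE_NAME:$BUILD_ID" "$PORT"
```

Splitting stop and rm into separate commands, each with its own `|| true`, avoids relying on how `||` and `&&` chain in a single line.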
In my jenkinsfile I have this
stage ('Build Docker') {
steps {
script {
image1 = docker.build "docker1:${BRANCH_NAME}"
}
script {
image2 = docker.build "docker2:${BRANCH_NAME}"
}
}
}
stage ('Run Docker Acceptance Tests') {
steps {
script {
container1 = image1.run "-v /tmp/${BRANCH_NAME}:/var/lib/data"
container1Id = container1.id
container1IP = sh script: "docker inspect ${container1Id} | grep IPAddress | grep -v null| cut -d \'\"\' -f 4 | head -1", returnStdout: true
}
//let containers start up
sleep 20
script {
container2= image2.run("-v /tmp/${BRANCH_NAME}:/var/lib/data --add-host=MY_HOST:${container1IP}")
}
}
}
When it gets to run container2 I get this output.
[resources] Running shell script
00:01:33.775 + docker run -d -v /tmp/master:/var/lib/data --add-host=MY_HOST:172.17.0.3
00:01:33.775 "docker run" requires at least 1 argument(s).
00:01:33.775 See 'docker run --help'.
Clearly it's not appending the image name when running the Docker image.
I tried just hardcoding in the IP address to test if it worked like this
container2= image2.run("-v /tmp/${BRANCH_NAME}:/var/lib/data --add-host=MY_HOST:172.17.0.3")
And then it worked and ran the command correctly
00:00:29.386 [resources] Running shell script
00:00:29.641 + docker run -d -v /tmp/master:/var/lib/data --add-host=MY_HOST:172.17.0.3 docker-name:branch
I don't understand why it's not picking up the image name.
I have even tried doing this - getting the same error
container2= image2.run("-v /tmp/${BRANCH_NAME}:/var/lib/data --add-host=MY_HOST:${container1IP} docker2:${BRANCH_NAME}")
My final step I tried
sh "docker run -v /tmp/${BRANCH_NAME}:/var/lib/data --add-host=MY_HOST:${container1IP} docker2:${BRANCH_NAME}"
Again it seems like it is stripping off the final command after resolving ${container1IP}
I managed to fix it; it was due to a hidden newline character at the end of the captured output:
container1IP = sh (script: "docker inspect ${container1Id} | grep IPAddress | grep -v null| cut -d \'\"\' -f 4 | head -1", returnStdout: true).trim()
Trimming the variable fixed it.
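The effect of that hidden newline can be reproduced in plain shell. With `returnStdout: true` the captured value keeps its trailing newline, so once it is interpolated the command text spans two lines and the image name lands on a line of its own. The sketch below rebuilds both variants of the command text and counts their lines; the IP address and image name are taken from the logs above, and nothing here calls Docker.

```shell
#!/bin/sh
# Rebuild the value the way returnStdout hands it to Groovy: with a trailing
# newline. $(...) strips trailing newlines, hence the X sentinel that is
# removed again on the next line.
raw_ip=$(printf '172.17.0.3\nX')
raw_ip=${raw_ip%X}

# Groovy's .trim() removes that newline; here it is stripped with a suffix pattern.
newline='
'
trimmed_ip=${raw_ip%"$newline"}

# The command text Jenkins would hand to the shell in each case.
raw_script="docker run -d --add-host=MY_HOST:${raw_ip} docker2:master"
trimmed_script="docker run -d --add-host=MY_HOST:${trimmed_ip} docker2:master"

# Count the lines of each script: 2 means the image name is on its own line.
raw_lines=$(( $(printf '%s\n' "$raw_script" | wc -l) ))
trimmed_lines=$(( $(printf '%s\n' "$trimmed_script" | wc -l) ))
echo "untrimmed: $raw_lines line(s), trimmed: $trimmed_lines line(s)"
```

The untrimmed variant is two lines long, which is why `docker run` reported a missing argument: its first line ends right after the `--add-host` option.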
I'm trying to execute an SSH command from inside a Docker container in a Jenkins pipeline. I'm using the CloudBees Docker Pipeline Plugin to spin up the container and execute commands, and the SSH Agent Plugin to manage my SSH keys. Here's a basic version of my Jenkinsfile:
node {
step([$class: 'WsCleanup'])
docker.image('node').inside {
stage('SSH') {
sshagent (credentials: [ 'MY_KEY_UUID' ]) {
sh "ssh -vvv -o StrictHostKeyChecking=no ubuntu@example.org uname -a"
}
}
}
}
When the SSH command runs, I get this error:
+ ssh -vvv -o StrictHostKeyChecking=no ubuntu@example.org uname -a
No user exists for uid 1005
I combed through the logs and realized the Docker Pipeline Plugin is automatically telling the container to run with the same user that is logged in on the host by passing a UID as a command line argument:
$ docker run -t -d -u 1005:1005 [...]
I decided to check what users existed in the host and the container by running cat /etc/passwd in each environment. Sure enough, the list of users was different in each. 1005 was the jenkins user on the host machine, but that UID didn't exist in the container. To solve the issue, I mounted /etc/passwd from the host to the container when spinning it up:
node {
step([$class: 'WsCleanup'])
docker.image('node').inside('-v /etc/passwd:/etc/passwd') {
stage('SSH') {
sshagent (credentials: [ 'MY_KEY_UUID' ]) {
sh "ssh -vvv -o StrictHostKeyChecking=no ubuntu@example.org uname -a"
}
}
}
}
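The check described above, looking for UID 1005 in `/etc/passwd` on the host and in the container, can be sketched as a small helper. The function name `uid_in_passwd` is hypothetical, and the two passwd files are made-up samples written to temporary files so the sketch runs anywhere; on a real setup the arguments would be the actual files from each environment.

```shell
#!/bin/sh
# uid_in_passwd UID FILE
# Succeed if some entry in a passwd-format FILE has that numeric UID (field 3).
uid_in_passwd() {
    awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }' "$2"
}

# Made-up sample files standing in for the host's and the container's /etc/passwd.
host_passwd=$(mktemp)
container_passwd=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'jenkins:x:1005:1005::/var/lib/jenkins:/bin/bash' > "$host_passwd"
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'node:x:1000:1000::/home/node:/bin/bash' > "$container_passwd"

if uid_in_passwd 1005 "$host_passwd"; then host_result=present; else host_result=missing; fi
if uid_in_passwd 1005 "$container_passwd"; then container_result=present; else container_result=missing; fi
echo "host: $host_result, container: $container_result"

rm -f "$host_passwd" "$container_passwd"
```

A "missing" result for the container is the situation behind the `No user exists for uid 1005` error.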
The solution provided by @nathan-thompson is awesome, but in my case I was unable to find the user even in the /etc/passwd of the host machine, so mounting the passwd file did not fix the problem. This question, https://superuser.com/questions/580148/users-not-found-in-etc-passwd, suggests that some users log in to the host through an identity provider such as LDAP.
The solution was finding a way to add the proper line to the passwd file on the container. Calling getent passwd $USER on the host will provide the passwd line for the Jenkins user running the container.
I added a step running on the node (and not the docker agent) to get the line and save it in a file. Then in the next step I mounted the generated passwd to the container:
stages {
stage('Create passwd') {
steps {
sh """echo \$(getent passwd \$USER) > /tmp/tmp_passwd
"""
}
}
stage('Test') {
agent {
docker {
image '*******'
args '***** -v /tmp/tmp_passwd:/etc/passwd'
reuseNode true
registryUrl '*****'
registryCredentialsId '*****'
}
}
steps {
sh """ssh -i ********
"""
}
}
}
I just found another solution to this problem that I want to share. It differs from the existing solutions in that it allows running the complete pipeline in one agent, instead of one per stage.
The trick is, instead of using an image directly, to refer to a Dockerfile (which may be built FROM the original) and then add the user:
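# Dockerfile
FROM node
ARG jenkinsUserId=
# Assumes the image already has a user and group named jenkins.
# The original referenced an undefined ${nodeId} for the group ID;
# reusing jenkinsUserId here assumes the GID should equal the UID.
RUN if ! id $jenkinsUserId; then \
        usermod -u ${jenkinsUserId} jenkins; \
        groupmod -g ${jenkinsUserId} jenkins; \
    fi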
// Jenkinsfile
pipeline {
agent {
dockerfile {
additionalBuildArgs "--build-arg jenkinsUserId=\$(id -u jenkins)"
}
}
}
agent {
docker {
image 'node:14.10.1-buster-slim'
args '-u root:root'
}
}
environment {
SSH_deploy = credentials('e99988ea-6bdc-45fc-b9e1-536b875bcac7')
}
stage('build') {
steps {
sh '''#!/bin/bash
eval $(ssh-agent -s)
cat $SSH_deploy | tr -d '\r' | ssh-add -
touch .env
echo 'REACT_APP_BASE_API = "//172.22.132.115:8080"' >> .env
echo 'REACT_APP_ADMIN_PANEL_URL = "//172.22.132.115"' >> .env
yarn install
CI=false npm run build
ssh -t -o StrictHostKeyChecking=no root@172.22.132.115 'rm -rf /usr/local/src/build'
scp -r -o StrictHostKeyChecking=no build root@172.22.132.115:/usr/local/src/
ssh -t -o StrictHostKeyChecking=no root@172.22.132.115 'systemctl restart nginx'
'''
}
Building on the solution provided by Nathan Thompson, I modified it this way for a Jenkins Docker build container that runs inside a Jenkins Docker slave (Docker in Docker):
if (validated_parameters.custom_gradle_image){
docker.image(validated_parameters.custom_gradle_image).inside(" -v /etc/passwd:/etc/passwd -v /var/lib/jenkins/.ssh/:/var/lib/jenkins/.ssh/ "){
sshagent(['jenkins-git-io']){
sh "${gradleCommand}"
}