Running Docker commands inside Jenkins pipeline - docker

Is there a proper way to run Docker commands through a Jenkins containerized service?
I see there are many plugins to support Docker commands in the Jenkins ecosystem, although all of them raise errors because Docker isn't installed in the Jenkins container.
I have a Dockerfile that provides a Jenkins image with a working Docker installation, but to work I have to mount the host's Docker socket:
FROM jenkins/jenkins:lts
USER root
RUN apt-get -y update && \
apt-get -y install sudo \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
RUN add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/debian \
$(lsb_release -cs) \
stable"
RUN apt-get -y update && \
apt-get -y install --allow-unauthenticated \
docker-ce \
docker-ce-cli \
containerd.io
RUN echo "jenkins:jenkins" | chpasswd && adduser jenkins sudo
RUN echo jenkins ALL= NOPASSWD: ALL >> /etc/sudoers
USER jenkins
It can be run like this:
docker run -d -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock
This way it's possible to run Docker commands inside the Jenkins container. Although, I am concerned about security: namely this way the Jenkins container can access all the containers running in the host machine, moreover Jenkins is a root user, which I wouldn't like for production.
I seek to run a Jenkins instance within a Kubernetes cluster to support CI and CD pipelines within that cluster, therefore I'm guessing Jenkins must be containerized.
Am I missing something?

Related

mkdir: cannot create directory ‘cpuset’: Read-only file system when running a "service docker start" in Dockerfile

I have a Dockerfile that extends the Apache Airflow 2.5.1 base image. What I want to do is be able to use docker inside my airflow containers (i.e. docker-in-docker) for testing and evaluation purposes.
My docker-compose.yaml has the following mount:
volumes:
- /var/run/docker.sock:/var/run/docker.sock
My Dockerfile looks as follows:
FROM apache/airflow:2.5.1
USER root
RUN apt-get update && apt-get install -y ca-certificates curl gnupg lsb-release nano
RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
RUN echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN apt-get update
RUN apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
RUN groupadd -f docker
RUN usermod -a -G docker airflow
RUN service docker start
USER airflow
Basically:
Install docker.
Add the airflow user to the docker group.
Start the docker service.
Continue as airflow.
Unfortunately, this does not work. During RUN service docker start, I encounter the following error:
Step 11/12 : RUN service docker start
---> Running in 77e9b044bcea
mkdir: cannot create directory ‘cpuset’: Read-only file system
I have another Dockerfile for building a local jenkins image, which looks as follows:
FROM jenkins/jenkins:lts-jdk11
USER root
RUN apt-get update && apt-get install -y ca-certificates curl gnupg lsb-release nano
RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
RUN echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN apt-get update
RUN apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
RUN groupadd -f docker
RUN usermod -a -G docker jenkins
RUN service docker start
USER jenkins
I.e. it is exactly the same, except that I am using the jenkins user. Building this image works.
I have not set any extraneous permission on my /var/run/docker.sock:
$ ls -la /var/run/docker.sock
srw-rw---- 1 root docker 0 Jan 18 17:14 /var/run/docker.sock
My questions are:
Why does RUN service start docker not work when building my airflow image?
Why does the exact same command in my jenkins Dockerfile work?
I've tried most of the answers to similar questions, e.g. here and here, but they have unfortunately not helped.
I'd rather try to avoid the chmod 777 /var/run/docker.sock solution if at all possible, and it should be since my jenkins image can build correctly...
Just delete the RUN service start docker line.
The docker CLI tool needs to connect to a Docker daemon, which it normally does through the /var/run/docker.sock Unix socket file. Bind-mounting the socket into the container is enough to make the host's Docker daemon accessible; you do not need to separately start Docker in the container.
There are several issues with the RUN service ... line specifically. Docker has a kind of complex setup internally, and some of the things it does aren't normally allowed in a container; that's probably related to the "cannot create directory" error. In any case, a Docker image doesn't persist running processes, so if you were able to start Docker inside the build, it wouldn't still be running when the container eventually ran.
More conceptually, a container doesn't "run services", it is a wrapper around only a single process (and its children). Commands like service or systemctl often won't work the way you expect, and I'd generally avoid them in a Docker context.

"docker:19.03-dind" could not select device driver "nvidia" with capabilities: [[gpu]]

I got a K8S+DinD issue:
launch Kubernetes cluster
start a main docker image and a DinD image inside this cluster
when running a job requesting GPU, got error could not select device driver "nvidia" with capabilities: [[gpu]]
Full error
http://localhost:2375/v1.40/containers/long-hash-string/start: Internal Server Error ("could not select device driver "nvidia" with capabilities: [[gpu]]")
exec to the DinD image inside of K8S pod, nvidia-smi is not available.
Some debugging and it seems it's due to the DinD is missing the Nvidia-docker-toolkit, I had the same error when I ran the same job directly on my local laptop docker, I fixed the same error by installing nvidia-docker2 sudo apt-get install -y nvidia-docker2.
I'm thinking maybe I can try to install nvidia-docker2 to the DinD 19.03 (docker:19.03-dind), but not sure how to do it? By multiple stage docker build?
Thank you very much!
update:
pod spec:
spec:
containers:
- name: dind-daemon
image: docker:19.03-dind
I got it working myself.
Referring to
https://github.com/NVIDIA/nvidia-docker/issues/375
https://github.com/Henderake/dind-nvidia-docker
First, I modified the ubuntu-dind image (https://github.com/billyteves/ubuntu-dind) to install nvidia-docker (i.e. added the instructions in the nvidia-docker site to the Dockerfile) and changed it to be based on nvidia/cuda:9.2-runtime-ubuntu16.04.
Then I created a pod with two containers, a frontend ubuntu container and the a privileged docker daemon container as a sidecar. The sidecar's image is the modified one I mentioned above.
But since this post is 3 year ago from now, I did spent quite some time to match up the dependencies versions, repo migration over 3 years, etc.
My modified version of Dockerfile to build it
ARG CUDA_IMAGE=nvidia/cuda:11.0.3-runtime-ubuntu20.04
FROM ${CUDA_IMAGE}
ARG DOCKER_CE_VERSION=5:18.09.1~3-0~ubuntu-xenial
RUN apt-get update -q && \
apt-get install -yq \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common && \
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable" && \
apt-get update -q && apt-get install -yq docker-ce docker-ce-cli containerd.io
# https://github.com/docker/docker/blob/master/project/PACKAGERS.md#runtime-dependencies
RUN set -eux; \
apt-get update -q && \
apt-get install -yq \
btrfs-progs \
e2fsprogs \
iptables \
xfsprogs \
xz-utils \
# pigz: https://github.com/moby/moby/pull/35697 (faster gzip implementation)
pigz \
# zfs \
wget
# set up subuid/subgid so that "--userns-remap=default" works out-of-the-box
RUN set -x \
&& addgroup --system dockremap \
&& adduser --system -ingroup dockremap dockremap \
&& echo 'dockremap:165536:65536' >> /etc/subuid \
&& echo 'dockremap:165536:65536' >> /etc/subgid
# https://github.com/docker/docker/tree/master/hack/dind
ENV DIND_COMMIT 37498f009d8bf25fbb6199e8ccd34bed84f2874b
RUN set -eux; \
wget -O /usr/local/bin/dind "https://raw.githubusercontent.com/docker/docker/${DIND_COMMIT}/hack/dind"; \
chmod +x /usr/local/bin/dind
##### Install nvidia docker #####
# Add the package repositories
RUN curl -fsSL https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add --no-tty -
RUN distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && \
echo $distribution && \
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
tee /etc/apt/sources.list.d/nvidia-docker.list
RUN apt-get update -qq --fix-missing
RUN apt-get install -yq nvidia-docker2
RUN sed -i '2i \ \ \ \ "default-runtime": "nvidia",' /etc/docker/daemon.json
RUN mkdir -p /usr/local/bin/
COPY dockerd-entrypoint.sh /usr/local/bin/
RUN chmod 777 /usr/local/bin/dockerd-entrypoint.sh
RUN ln -s /usr/local/bin/dockerd-entrypoint.sh /
VOLUME /var/lib/docker
EXPOSE 2375
ENTRYPOINT ["dockerd-entrypoint.sh"]
#ENTRYPOINT ["/bin/sh", "/shared/dockerd-entrypoint.sh"]
CMD []
When I use exec to login into the Docker-in-Docker container, I can successfully run nvidia-smi (which previously return not found error then cannot run any GPU resource related docker run)
Welcome to pull my image at brandsight/dind:nvidia-docker

Dockerfile from Jenkins JDK 11 with Docker engine installed - Cannot connect to the Docker daemon

Ive created a Dockerfile that is based off jenkins/jenkins:lts-jdk11
Im trying to install docker + docker compose so that jenkins will have access to this when i create my pipeline for CD/CI.
Here is my Dockerfile:
FROM jenkins/jenkins:lts-jdk11 AS jenkins
WORKDIR /home/jenkins
RUN chown -R 1000:1000 /var/jenkins_home
USER root
# Install aws cli version 2
RUN apt-get update && apt-get install -y unzip curl vim bash sudo
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install
#Install docker cli command
RUN sudo apt-get update
RUN sudo apt install -y apt-transport-https ca-certificates curl gnupg lsb-release
RUN curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
RUN echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN sudo apt-get update
RUN sudo apt-get install -y docker-ce docker-ce-cli containerd.io
##Install docker compose
RUN mkdir -p /usr/local/lib/docker/cli-plugins
RUN curl -SL https://github.com/docker/compose/releases/download/v2.2.3/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose
RUN chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
RUN sudo usermod -a -G docker jenkins
The docker commands work well within the container but as soon as i start to build an image it displays this error:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
If i try to start the docker service with service docker start i get the following error:
mkdir: cannot create directory ‘cpuset’: Read-only file system
Im not sure how to solve this one.
TIA
container does not use an init system. The Docker service cannot be started because of this.

How to deploy to a (local) Kubernetes cluster using Jenkins

This question is somewhat related to one of my previous questions as in it gives a clearer idea on what I am trying to achieve.. This question is about an issue I ran into when trying to achieve the task in that previous question...
I am trying to test if my kubectl works from within the Jenkins container. When I start up my Jenkins container I use the following command:
docker run \
-v /home/student/Desktop/jenkins_home:/var/jenkins_home \
-v $(which kubectl):/usr/local/bin/kubectl \ #bind docker host binary to docker container binary
-v ~/.kube:/home/jenkins/.kube \ #docker host kube config file stored in /.kube directory. Binding this to $HOME/.kube in the docker container
-v /var/run/docker.sock:/var/run/docker.sock \
-v $(which docker):/usr/bin/docker -v ~/.kube:/home/root/.kube \
--group-add 998
-p 8080:8080 -p 50000:50000
-d --name jenkins jenkins/jenkins:lts
The container starts up and I can login/create jobs/run pipeline scripts all no issue.
I created a pipeline script just to check if I can access my cluster like this:
pipeline {
agent any
stages {
stage('Kubernetes test') {
steps {
sh "kubectl cluster-info"
}
}
}
}
When running this job, it fails with the following error:
+ kubectl cluster-info // this is the step
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
error: the server doesn't have a resource type "services"
Thanks!
I'm not getting why there is:
-v $(which kubectl):/usr/local/bin/kubectl -v ~/.kube:/home/jenkins/.kube
/usr/local/bin/kubectl is a kubectl binary and ~/.kube:/home/jenkins/.kube should be the location where the kubectl binary looks for the cluster context file i.e. kubeconfig. First, you should make sure that the kubeconfig is mounted to the container at /home/jenkins/.kube and is accessible to kubectl binary. After appropriate volume mounts, you can verify by creating a session in the jenkins container with docker container exec -it jenkins /bin/bash and test with kubectl get svc. Make sure you have KUBECONFIG env var set in the session with:
export KUBECONFIG=/home/jenkins/.kube/kubeconfig
Before you run the verification test and
withEnv(["KUBECONFIG=$HOME/.kube/kubeconfig"]) {
// Your stuff here
}
In your pipeline code. If it works with the session, it should work in the pipeline as well.
I would personally recommend to create a custom Docker image for Jenkins which will contain kubectl binary and other utilities necessary (such as aws-iam-authenticator for AWS EKS IAM-based authentication) for working with Kubernetes cluster. This creates isolation between your host system binaries and your Jenkins binaries.
Below is the Dockerfile I'm using which contains, helm, kubectl and aws-iam-authenticator.
# This Dockerfile contains Helm, Docker client-only, aws-iam-authenticator, kubectl with Jenkins LTS.
FROM jenkins/jenkins:lts
USER root
ENV VERSION v2.9.1
ENV FILENAME helm-${VERSION}-linux-amd64.tar.gz
ENV HELM_URL https://storage.googleapis.com/kubernetes-helm/${FILENAME}
ENV KUBE_LATEST_VERSION="v1.11.0"
# Install the latest Docker CE binaries
RUN apt-get update && \
apt-get -y install apt-transport-https \
ca-certificates \
curl \
gnupg2 \
software-properties-common && \
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg > /tmp/dkey; apt-key add /tmp/dkey && \
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable" && \
apt-get update && \
apt-get -y install docker-ce \
&& curl -o /tmp/$FILENAME ${HELM_URL} \
&& tar -zxvf /tmp/${FILENAME} -C /tmp \
&& mv /tmp/linux-amd64/helm /bin/helm \
&& rm -rf /tmp/linux-amd64/helm \
&& curl -L https://storage.googleapis.com/kubernetes-release/release/${KUBE_LATEST_VERSION}/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl \
&& curl -L https://amazon-eks.s3-us-west-2.amazonaws.com/1.11.5/2018-12-06/bin/linux/amd64/aws-iam-authenticator -o /usr/local/bin/aws-iam-authenticator \
&& chmod +x /usr/local/bin/aws-iam-authenticator
Kubernetes fails inside jenkins pipeline this was my solution for jenkins installed locally on a windows machine.

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock:

Use Case:
Base Instance has a Ubuntus 16.04
Installed Docker and it works find and I'm able to checkout docker images.
Deployed an Instance of Jenkins Docker container.
docker run -p 8080:8080 \
-v /var/run/docker.sock:/var/run/docker.sock \
--name jenkins \
jenkins/jenkins:lts
This Jenkins Instance will mount the host machine's Docker socket in the container. As Mentioned in the Article.
https://getintodevops.com/blog/the-simple-way-to-run-docker-in-docker-for-ci
Now installed he docker binaries on the Jenkins container.
apt-get update && \
apt-get -y install apt-transport-https \
ca-certificates \
curl \
gnupg2 \
software-properties-common && \
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg > /tmp/dkey; apt-key add /tmp/dkey && \
add-apt-repository \ "deb [arch=amd64] https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable" && \
apt-get update && \
apt-get -y install docker-c
Ran the Docker ps from the Jenkins conatiner and listed the available Containers.
But when triggering a job from Jenkins it fails withe the below error
+ docker run hello-world
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
I tried solution provided to add the user to the group but it still fails
https://techoverflow.net/2017/03/01/solving-docker-permission-denied-while-trying-to-connect-to-the-docker-daemon-socket/
Any help is greatly appreciated.
Thank you
I got this to work.
ADD THE USER TO THE GROUP
STOP AND RESTART THE CONTAINER

Resources