How to access the root folder inside a Docker container

I am new to Docker and am attempting to build an image that involves running an npm install. Some of our dependencies come from private repos we have, and I am hitting an SSH-related issue:
I realised I was not supplying any SSH details in my Dockerfile, and came across various posts online about how to do this by passing args to the docker build command.
So, taken from here, I have added the following to my Dockerfile before the npm install command gets run:
ARG ssh_prv_key
ARG ssh_pub_key
RUN apt-get update && \
apt-get install -y \
git \
openssh-server \
libmysqlclient-dev
# Authorize SSH Host
RUN mkdir -p /root/.ssh && \
chmod 0700 /root/.ssh && \
ssh-keyscan github.com > /root/.ssh/known_hosts
# Add the keys and set permissions
RUN echo "$ssh_prv_key" > /root/.ssh/id_rsa && \
echo "$ssh_pub_key" > /root/.ssh/id_rsa.pub && \
chmod 600 /root/.ssh/id_rsa && \
chmod 600 /root/.ssh/id_rsa.pub
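For context, the build args are supplied on the command line; a sketch, assuming the keys live in the default ~/.ssh location and where the image tag my-app is a placeholder:
docker build \
  --build-arg ssh_prv_key="$(cat ~/.ssh/id_rsa)" \
  --build-arg ssh_pub_key="$(cat ~/.ssh/id_rsa.pub)" \
  -t my-app .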
So running the docker build command again with the correct args supplied, I do see further activity in the console that suggests my SSH key is being utilised:
But as you can see I am getting "no hostkey alg" messages, and I am still getting the same 'Host key verification failed' error. I was wondering if I could view the log file it references in the error:
Do I need to get the image running in order to be able to connect to it and browse the 'root' folder?
I hope I have made sense. Please be gentle, I am a Docker noob!
Thanks

The lines that start with ---> in the docker build output are valid Docker image IDs. You can pick any of these and docker run them:
docker run --rm -it 59c45dac474a sh
If a step is actually failing, one useful debugging trick is to launch the image built in the step before it and run the command by hand.
Remember that anyone who has your image can do this; the way you’ve built it, if you ever push your image to any repository, your ssh private key is there for the taking, and you should probably consider it compromised. That’s doubly true since it will also be there in plain text in docker history output.
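For example, something along these lines (my-app being a placeholder image name) will show the key recorded alongside the layer commands:
docker history --no-trunc my-app | grep ssh_prv_key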

Related

Adding build tools to a Kaniko image for Gitlab-CI

Given a monorepo of ~35 services built with GitLab CI and k8s runners.
The images are built using Kaniko, utilizing <job>.extends of a prototype template, and life is great.
However, lately we wanted to save a key in Consul and change a GitLab CI env var after a successful build - which requires curl, and preferably jq.
I've been trying to create the following image to serve as the image for image-building jobs:
FROM gcr.io/kaniko-project/executor:debug
RUN mkdir -p /workspace \
&& wget -qO /workspace/curl https://github.com/moparisthebest/static-curl/releases/download/v7.86.0/curl-amd64 \
&& chmod +x /workspace/curl \
&& wget -qO /workspace/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64 \
&& chmod +x /workspace/jq
ENV PATH "$PATH:/workspace"
The build of which appears to succeed.
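For reference, the tool image is built and pushed like any other Docker image; a sketch, where the registry path is a placeholder:
docker build -t registry.example.com/ci/kaniko-tools:debug .
docker push registry.example.com/ci/kaniko-tools:debug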
However, in practice, when used in a pipeline job with the following script:
.build-with-kaniko:
  script:
    - mkdir -p /kaniko/.docker;
      echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":..... > /kaniko/.docker/config.json
    - which jq || log no jq;
      which curl || log no curl;
    - >-
      /kaniko/executor
      --context $PROJECT_PATH
      --dockerfile $DOCKERFILE
      --destination ${CI_REGISTRY}/${DOCKER_REPO}:${TAG}
    - which jq || log no jq;
      which curl || log no curl;
Before running the executor - the curl and jq are found.
But after running the executor - they are gone!! <tam-tam-taaaaaaAAAMM!!!> :o
I tried placing them in a few different folders: /busybox, /kaniko, /workspace or even a custom dir /misc - and could not get it to work...
I thought maybe it packs them to the target image - but no, they are not there.
I also noted that after building with --no-push they are still there
(but then I do not get my image on the registry...).
What is going on? is there a post-push cleanup mechanism I should instruct to leave these two files?
Help?
What must I do to help kaniko understand I need these two utilities?
OMG. :facepalm:
I knew I'd find the answer only after posting the question... :shrug:
Here's what worked:
Declare the directory holding the tools as a volume:
FROM gcr.io/kaniko-project/executor:debug
RUN mkdir -p /misc \
&& wget -qO /misc/curl https://github.com/moparisthebest/static-curl/releases/download/v7.86.0/curl-amd64 \
&& chmod +x /misc/curl \
&& wget -qO /misc/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64 \
&& chmod +x /misc/jq
VOLUME /misc
ENV PATH "$PATH:/misc"
I got the clue from the current Dockerfile of the kaniko:debug image itself (at the time of this writing).
The image is recommended to be used as the base image for gitlab-ci jobs that use kaniko - and it includes /busybox.
I still don't understand why putting the tools in /busybox dir did not work, but I got a working solution now, and no time to dig deeper :sad: :shrug:

Docker file owners and groups

I think I have a dilemma. I am trying to create a Dockerfile to reproduce a long and complicated installation process (of ROS) so that my students can get it running with less headache.
I am combining various provided scripts with manual steps that are documented. The manual steps often say to do "sudo", but I am told that using sudo inside a Dockerfile is to be avoided. So I move those steps to before the USER command in the Dockerfile, because I am told that those commands run as root. However, as a result, the files and directories created are owned by root, and I believe subsequent steps are failing.
I have two choices, I think: move the commands to after the USER command and include sudo, or try to make the install scripts create directories and files with the right ownership. Of course, a priori I don't know what files and directories are going to be created.
Here is my Dockerfile (actually one of many I have been experimenting with.) Also if you see any other things that need to be improved or fixed please let me know!
FROM ubuntu:16.04
# create non-root user
ENV USERNAME ros
RUN adduser --ingroup sudo --disabled-password --gecos "" --shell /bin/bash --home /home/$USERNAME $USERNAME
RUN bash -c 'echo $USERNAME:ros | chpasswd'
ENV HOME /home/$USERNAME
RUN apt-get update && apt-get install --assume-yes wget sudo && \
wget https://raw.githubusercontent.com/ROBOTIS-GIT/robotis_tools/master/install_ros_kinetic.sh && \
chmod 755 ./install_ros_kinetic.sh && \
bash ./install_ros_kinetic.sh
RUN apt-get install --assume-yes ros-kinetic-joy ros-kinetic-teleop-twist-joy ros-kinetic-teleop-twist-keyboard ros-kinetic-laser-proc ros-kinetic-rgbd-launch ros-kinetic-depthimage-to-laserscan ros-kinetic-rosserial-arduino ros-kinetic-rosserial-python ros-kinetic-rosserial-server ros-kinetic-rosserial-client ros-kinetic-rosserial-msgs ros-kinetic-amcl ros-kinetic-map-server ros-kinetic-move-base ros-kinetic-urdf ros-kinetic-xacro ros-kinetic-compressed-image-transport ros-kinetic-rqt-image-view ros-kinetic-gmapping ros-kinetic-navigation ros-kinetic-interactive-markers
USER $USERNAME
WORKDIR /home/$USERNAME
RUN cd /home/$USERNAME/catkin_ws/src/ && \
git clone https://github.com/ROBOTIS-GIT/turtlebot3_msgs.git && \
git clone https://github.com/ROBOTIS-GIT/turtlebot3.git && \
git clone https://github.com/ROBOTIS-GIT/turtlebot3_simulations.git
# add catkin env
RUN echo 'source /opt/ros/kinetic/setup.bash' >> /home/$USERNAME/.bashrc
RUN echo 'source /home/ros/catkin_ws/devel/setup.bash' >> /home/$USERNAME/.bashrc
# RUN . /home/ros/.bashrc && \
# cd /home/$USERNAME/catkin_ws && \
# catkin_make
USER $USERNAME
ENTRYPOINT /bin/bash
It would be interesting, for my own information, to learn why sudo should be avoided in containers.
Historically we have used Docker to automate build, test and deploy processes in our team, and we have always tried to write Dockerfiles as close as possible to the original process.
Let's say you build some app on your host and launch some commands with sudo and some without; we managed to create Dockerfiles that mirror exactly those steps. The positive side effect is that you are no longer obliged to write READMEs on how to build the code - you just supply the Dockerfile, and whenever someone wants to repeat the steps in a non-container environment, they simply follow (copy/paste) the commands from the file.
So my proposal is: in the Dockerfile, install packages first, then switch to the user and proceed with all the remaining steps, using sudo when necessary. You will have all artifacts owned by the user, not root.
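A minimal sketch of that layout, assuming a user named ros as in the question (the packages and repo here are only illustrative):
FROM ubuntu:16.04
# as root: install packages, including sudo itself
RUN apt-get update && apt-get install -y sudo git wget
# create the user and allow passwordless sudo for later steps
RUN adduser --disabled-password --gecos "" ros \
&& echo 'ros ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER ros
WORKDIR /home/ros
# as ros: files created from here on are owned by ros;
# use sudo only for steps that genuinely need root
RUN git clone https://github.com/ROBOTIS-GIT/turtlebot3.git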
UPD
I found the original discussion and this one. So it sounds like you should choose the best approach based on your particular case and needs.

Docker proxy config not working for ADD in Dockerfile

I try to write a Dockerfile that adds a file to the image like this:
ADD https://repository.internal/file.zip /tmp/
The repository.internal host is only reachable through a proxy. I provide the proxy configuration with the --config option, but the ADD command does not seem to use the proxy and fails.
I know the proxy configuration is correct because I added the line
RUN curl https://repository.internal/file.zip
which is working fine.
Is there any way to make the ADD command use the proxy config as well?
As per my comments above, I believe this has something to do with how the Docker build process handles the ADD and RUN commands internally... I can't find documentation to back this up - so someone with greater internal knowledge may confirm or deny - but it makes sense, as a RUN command is executed in a container layered on top of the image being built, whereas the ADD of a URL is performed by the build process itself and its result is baked into the image.
Whichever way this is being handled, you can achieve what you need by moving to the RUN method as follows:
FROM <your base image>
RUN curl https://repository.internal/file.zip >> /tmp/file.zip \
&& cd /tmp \
&& unzip file.zip \
&& rm file.zip
And you will have the files unzipped.
You may need to check if the rm at the end is required - I can't remember off the top of my head if the unzip command removes the original zip file (it does not, so the rm is worth keeping).
As you mentioned, this relies on the curl and unzip packages being available in the image... however, you could avoid having these within your final application image by using Docker Multi Stage Builds.
Your Dockerfile would then look something like:
FROM <some useful base image> as collector
RUN apt-get update && apt-get install -y curl unzip
RUN mkdir /tmp/files \
&& curl https://repository.internal/file.zip >> /tmp/files/file.zip \
&& cd /tmp/files \
&& unzip file.zip \
&& rm file.zip
FROM <your final desired base image>
COPY --from=collector /tmp/files /tmp
This uses a throwaway build stage that has curl and unzip installed to fetch and extract your files, without having to install those tools in your final application image.
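As a side note, if the proxy configuration ever needs to be passed explicitly to the RUN steps, Docker treats the standard proxy variables as predefined build args (no ARG declaration required); a sketch with a placeholder proxy URL:
docker build \
  --build-arg http_proxy=http://proxy.internal:3128 \
  --build-arg https_proxy=http://proxy.internal:3128 \
  -t myimage .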

Passing Google service account credentials to Docker

My use case is a little different than others with this problem, so a little up-front description:
I am working on Google Cloud and have a "dockerized" Django app. Part of the app depends on using gsutil for moving files to/from a Google Storage bucket. For various reasons, we do not want to use Google Container Engine to manage our containers. Rather, we would like to scale horizontally by starting additional Google Compute VMs which will, in turn, run this Docker container. Similar to https://cloud.google.com/python/tutorials/bookshelf-on-compute-engine except we will use a container rather than pulling a git repository.
The VMs will be built from a basic Debian image, and the startup and installation of dependencies (e.g. Docker itself) will be orchestrated with a startup script (e.g. gcloud compute instances create some-instance --metadata-from-file startup-script=/path/to/startup.sh).
If I manually create a VM, elevate with sudo -s, run gsutil config -f (which creates a credential file at /root/.boto) and then run my docker container (see Dockerfile below) with
docker run -v /root/.boto:/root/.boto username/gs gsutil ls gs://my-test-bucket
then it works. However, that requires my interaction to create the boto file.
My question is: How can I pass the default service credentials to the Docker container that will be starting in that new VM?
gsutil works out of the box on even a "fresh" Debian VM since it is using the default compute engine credentials that all VMs are loaded with. Is there a way to use those credentials and pass them to the docker container? After the first call to gsutil on a fresh VM, I've noticed that it creates ~/.gsutil and ~/.config folders. Unfortunately, mounting both of those in Docker with
docker run -v ~/.config/:/root/.config -v ~/.gsutil:/root/.gsutil username/gs gsutil ls gs://my-test-bucket
does not fix my problem. It tells me:
ServiceException: 401 Anonymous users does not have storage.objects.list access to bucket my-test-bucket.
A minimal gsutil Dockerfile (not mine):
FROM alpine
#install deps and install gsutil
RUN apk add --update \
python \
py-pip \
py-cffi \
py-cryptography \
&& pip install --upgrade pip \
&& apk add --virtual build-deps \
gcc \
libffi-dev \
python-dev \
linux-headers \
musl-dev \
openssl-dev \
&& pip install gsutil \
&& apk del build-deps \
&& rm -rf /var/cache/apk/*
CMD ["gsutil"]
Addition: a workaround:
I have since solved my issue, but it is quite roundabout so I'm still interested in a simpler way, if possible. All the details are below:
First, a description:
I first created a service account in the web console and saved the JSON keyfile (call it credentials.json) into a storage bucket. In the startup script for the GCE VM, I copy that keyfile to the local filesystem (gsutil cp gs://<bucket>/credentials.json /gs_credentials/). I then start my Docker container, mounting that local directory. As the container starts, it runs a script that activates credentials.json (which creates a .boto file inside the container), exports BOTO_PATH to point at that file, and finally I can perform gsutil operations in the container.
Here are the files for a small working example:
Dockerfile:
FROM alpine
#install deps and install gsutil
RUN apk add --update \
python \
py-pip \
py-cffi \
py-cryptography \
bash \
curl \
&& pip install --upgrade pip \
&& apk add --virtual build-deps \
gcc \
libffi-dev \
python-dev \
linux-headers \
musl-dev \
openssl-dev \
&& pip install gsutil \
&& apk del build-deps \
&& rm -rf /var/cache/apk/*
# install the gcloud SDK-
# this allows us to use gcloud auth inside the container
RUN curl -sSL https://sdk.cloud.google.com > /tmp/gcl \
&& bash /tmp/gcl --install-dir=~/gcloud --disable-prompts
RUN mkdir /startup
ADD gsutil_docker_startup.sh /startup/gsutil_docker_startup.sh
ADD get_account_name.py /startup/get_account_name.py
ENTRYPOINT ["/startup/gsutil_docker_startup.sh"]
gsutil_docker_startup.sh: Takes a single argument, which is the path to a JSON-format service account credentials file. The file exists because the directory on the host machine was mounted in the container.
#!/bin/bash
CRED_FILE_PATH=$1
mkdir /results
# List the bucket, see that it gives a "ServiceException:401"
gsutil ls gs://<input bucket> > /results/before.txt
# authenticate the credentials- this creates a .boto file:
/root/gcloud/google-cloud-sdk/bin/gcloud auth activate-service-account --key-file=$CRED_FILE_PATH
# need to extract the service account, which looks like:
# <service acct ID>@<google project>.iam.gserviceaccount.com
SERVICE_ACCOUNT=$(python /startup/get_account_name.py $CRED_FILE_PATH)
# with that service account, we can locate the .boto file:
export BOTO_PATH=/root/.config/gcloud/legacy_credentials/$SERVICE_ACCOUNT/.boto
# List the bucket and copy the file to an output bucket for good measure
gsutil ls gs://<input bucket> > /results/after.txt
gsutil cp /results/*.txt gs://<output bucket>/
get_account_name.py:
import json
import sys
j = json.load(open(sys.argv[1]))
sys.stdout.write(j['client_email'])
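Run by hand, it simply prints the client_email field from the key file, e.g. using the path the container sees:
python /startup/get_account_name.py /cloud_credentials/creds.json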
Then, the GCE startup script (executed automatically as the VM is started) is:
#!/bin/bash
# <SNIP>
# Install docker, other dependencies
# </SNIP>
# pull docker image
docker pull userName/containerName
# get credential file:
mkdir /cloud_credentials
gsutil cp gs://<bucket>/credentials.json /cloud_credentials/creds.json
# run container
# mount the host machine directory where the credentials were saved.
# Note that the container expects a single arg,
# which is the path to the credential file IN THE CONTAINER
docker run -v /cloud_credentials:/cloud_credentials \
userName/containerName /cloud_credentials/creds.json
You can assign a specific service account to your instance and then use Application Default Credentials in your code. Please verify these points before testing:
Set the instance access scopes to "Allow full access to all Cloud APIs"; scopes are not really a security feature.
Give your service account the right role, e.g. "Storage Object Viewer".
Authentication tokens are retrieved automatically by Application Default Credentials via the Google metadata server, which is reachable from your instance and from your Docker containers as well. There is no need to manage any credentials.
def implicit():
from google.cloud import storage
# If you don't specify credentials when constructing the client, the
# client library will look for credentials in the environment.
storage_client = storage.Client()
# Make an authenticated API request
buckets = list(storage_client.list_buckets())
print(buckets)
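For completeness, a VM can be created with a dedicated service account and broad API scopes up front; a sketch where the instance and service account names are placeholders:
gcloud compute instances create my-instance \
  --service-account=my-sa@my-project.iam.gserviceaccount.com \
  --scopes=cloud-platform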
I also quickly tested with Docker and it worked perfectly:
yann@test:~$ gsutil cat gs://my-test-bucket/hw.txt
Hello World
yann@test:~$ docker run --rm google/cloud-sdk gsutil cat gs://my-test-bucket/hw.txt
Hello World

How to add a file to an image in Dockerfile without using the ADD or COPY directive

I need the contents of a large *.zip file (5 gb) in my Docker container in order to compile a program. The *.zip file resides on my local machine. The strategy for this would be:
COPY program.zip /tmp/
RUN cd /tmp \
&& unzip program.zip \
&& make
After having done this I would like to remove the unzipped directory and the original *.zip file because they are not needed any more. The problem is that the COPY (and also the ADD) directive will add a layer to the image that contains the file program.zip, which is problematic as my image will then be at least 5 GB big. Is there a way to add a file to a container without using the COPY or ADD directive? wget will not work as the mentioned *.zip file is on my local machine, and curl file://localhost/home/user/program.zip -o /tmp/program.zip will not work either.
It is not straightforward but it can be done via wget or curl with a little support from python. (All three tools should usually be available on a *nix system.)
wget will not work when no url is given and
curl file://localhost/home/user/program.zip -o /tmp/
will not work from within a Dockerfile's RUN instruction. Hence, we will need a server which wget and curl can access and download program.zip from.
To do this we set up a little Python server which serves our HTTP requests. We will be using Python 3's http.server module for this (on Python 2 the equivalent module is SimpleHTTPServer).
python3 -m http.server --bind 192.168.178.20 8000
The Python server will serve all files in the directory it is started in, so make sure you start the server either in the directory where the file you want to download during the image build resides, or in a temporary directory which contains your program. For illustration purposes let's create the file foo.txt which we will later download via wget in our Dockerfile:
echo "foo bar" > foo.txt
When starting the HTTP server it is important that we bind it to the IP address of our local machine on the LAN; here we also use port 8000. Having done this we should see the following output:
python3 -m http.server --bind 192.168.178.20 8000
Serving HTTP on 192.168.178.20 port 8000 ...
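A quick sanity check from another shell (on the same machine or anywhere on the LAN) should return the file's contents:
curl http://192.168.178.20:8000/foo.txt
foo bar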
Now we build a Dockerfile to illustrate how this works. (We will assume that the file foo.txt should be downloaded into /tmp):
FROM debian:latest
RUN apt-get update -qq \
&& apt-get install -y wget
RUN cd /tmp \
&& wget http://192.168.178.20:8000/foo.txt
Now we start the build with
docker build -t test .
During the build you will see the following output on our python server:
172.17.0.21 - - [01/Nov/2014 23:32:37] "GET /foo.txt HTTP/1.1" 200 -
and the build output of our image will be:
Step 2 : RUN cd /tmp && wget http://192.168.178.20:8000/foo.txt
---> Running in 49c10e0057d5
--2014-11-01 22:56:15-- http://192.168.178.20:8000/foo.txt
Connecting to 192.168.178.20:8000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25872 (25K) [text/plain]
Saving to: `foo.txt'
0K .......... .......... ..... 100% 129M=0s
2014-11-01 22:56:15 (129 MB/s) - `foo.txt' saved [25872/25872]
---> 5228517c8641
Removing intermediate container 49c10e0057d5
Successfully built 5228517c8641
You can then check if it really worked by starting and entering a container from the image you just built:
docker run -i -t --rm test bash
You can then look in /tmp for foo.txt.
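Alternatively, a one-off command does the same check without an interactive shell:
docker run --rm test ls -l /tmp/foo.txt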
We can now add any file to our image without the file being left behind in a layer. Assuming you want to add a program of about 5 GB as mentioned in the question, we could do:
FROM debian:latest
RUN apt-get update -qq \
&& apt-get install -y wget
RUN cd /tmp \
&& wget http://conventiont:8000/program.zip \
&& unzip program.zip \
&& cd program \
&& make \
&& make install \
&& cd /tmp \
&& rm -f program.zip \
&& rm -rf program
In this way we will not be left with 10 gb of cruft.
There's no way to do this. A feature request is here https://github.com/docker/docker/issues/3156.
Can you not map a local folder into the container when it is launched and then copy the files you need?
sudo docker run -d -P --name myContainerName -v /localpath/zip_extract:/container/path/ yourContainerID
https://docs.docker.com/userguide/dockervolumes/
I have posted a similar answer here: https://stackoverflow.com/a/37542913/909579
You can use docker-squash to squash the newly created layers. That will essentially remove the archive from the final image if you remove it in a subsequent RUN instruction.
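Assuming the goldmann/docker-squash tool, usage is roughly as follows (image names are placeholders; check the tool's README for the exact flags):
docker-squash -t my-image:squashed my-image:latest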
