Docker multi-stage with AWS CLI

I am setting up a multi-stage build in Docker where I need to pull some data from a remote image. In that remote image, I see they installed the AWS CLI using the following set of commands to get it into an Alpine-based image:
RUN apk --no-cache add python3 && \
    pip3 install awscli && \
    aws --version
The COPY commands seem to work just fine:
COPY --from=remote_setup /usr/bin/terraform /usr/bin/terraform
COPY --from=remote_setup /usr/bin/aws /usr/bin/aws
Terraform here runs peachy, but AWS does not. The output looks like this:
/ # terraform -v
Terraform v0.12.2
/ # ls -lh /usr/bin | grep aws
-rwxr-xr-x 1 root root 817 Jun 19 19:51 aws
/ # aws --version
/bin/sh: aws: not found
If I add python3, I then get this error:
/ # aws --version
Traceback (most recent call last):
  File "/usr/bin/aws", line 19, in <module>
    import awscli.clidriver
ModuleNotFoundError: No module named 'awscli'
Is there a trick to copying over all the data from a command in that particular layer to my new one, or, for simplicity's sake, should I just install Python and the AWS CLI myself in my image?
Thanks!

pip is the standard Python package manager. In addition to installing a wrapper script in /usr/bin (or the current environment's bin directory), it also installs a fair bit of library code under a .../lib/pythonX.Y/site-packages/... tree. Also, packages are allowed to depend on other packages, so it won't be just a single directory inside site-packages.
In short: you will need the Python interpreter and everything the pip install installs, so you should run that command yourself in your derived image.
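A minimal sketch of that approach, reusing the same commands the remote image runs (the remote_setup stage name matches the COPY lines above; Terraform, being a single static binary, can still be copied across):
FROM alpine
# Install the interpreter, then the CLI and its site-packages tree,
# directly in this image instead of copying files across stages.
RUN apk --no-cache add python3 && \
    pip3 install awscli && \
    aws --version
# Terraform is a single static binary, so copying just the file works.
COPY --from=remote_setup /usr/bin/terraform /usr/bin/terraform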

Related

renaming a file with Dockerfile instructions

I'm trying to build a Docker image which clones a public repository and builds a library; the built library is then used by the main application. My local machine runs macOS, the Docker image is a Linux distro, so I can't just compile and move the file. The library needs to be renamed (mandatory: the output is .dylib, but to use it in Python it must become .so) and moved (optional).
ADD and COPY take my local machine as reference for the source; the relevant part of the Dockerfile is:
RUN git clone https://gitlab.com/somelib/somelib.git
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN cargo build --release --manifest-path=somelib/Cargo.toml
RUN cp somelib/target/release/libsomelib.dylib app/main/util/somelib.so #<-- ERROR HERE
But this doesn't work, because it fails to find libsomelib.dylib:
cp: cannot stat 'somelib/target/release/libsomelib.dylib': No such file or directory
Is this possible, or is Docker not meant for this operation?
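One likely explanation, offered as a guess rather than a confirmed fix: the build runs in a Linux container, and on Linux cargo emits a cdylib as libsomelib.so (the .dylib suffix is macOS-specific), so the file to rename would already have the .so extension:
RUN cargo build --release --manifest-path=somelib/Cargo.toml
# On Linux the build output is a .so, not a .dylib
RUN cp somelib/target/release/libsomelib.so app/main/util/somelib.so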

Run protoc command in a Docker container

I'm trying to run the protoc command inside a Docker container.
I've tried using the gRPC image, but the protoc command is not found:
/bin/sh: 1: protoc: not found
So I assume I have to install it manually using RUN instructions, but is there a better solution? An official precompiled image with protoc installed?
Also, I've tried to install it via the Dockerfile, but I'm again getting protoc: not found.
This is my Dockerfile:
#I'm not using "FROM grpc/node" because that image can't unzip
FROM node:12
...
# Download proto zip
ENV PROTOC_ZIP=protoc-3.14.0-linux-x86_32.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.14.0/${PROTOC_ZIP}
RUN unzip -o ${PROTOC_ZIP} -d ./proto
RUN chmod 755 -R ./proto/bin
ENV BASE=/usr/local
# Copy into path
RUN cp ./proto/bin/protoc ${BASE}/bin
RUN cp -R ./proto/include/* ${BASE}/include
RUN protoc -I=...
I've run RUN echo $PATH to ensure the folder is on the PATH, and it is:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Also RUN ls -la /usr/local/bin to check that the protoc file is in the folder, and it shows:
-rwxr-xr-x 1 root root 4849692 Jan 2 11:16 protoc
So the file is in the /usr/local/bin folder and that folder is on the PATH.
Have I missed something?
Also, is there a simple way to get an image with protoc installed, or is the best option to generate my own image and pull it from my repository?
Thanks in advance.
Edit: Solved by downloading the linux-x86_64 zip file instead of x86_32. I chose the lower architecture thinking an x86_64 machine could run an x86_32 file but not the other way around. I don't know if I'm missing something about architecture requirements (probably), or if it's a bug.
Anyway, in case it helps someone, I found the solution and have added an answer with the necessary Dockerfile to run protoc and protoc-gen-grpc-web.
The easiest way to get non-default tools like this is to install them through the underlying Linux distribution's package manager.
First, look at the Docker Hub page for the node image. (For "library" images like node, construct the URL https://hub.docker.com/_/node.) You'll notice there that there are several variations named "alpine", "buster", or "stretch"; plain node:12 is the same as node:12-stretch and node:12.20.0-stretch. The "alpine" images are based on Alpine Linux; the "buster" and "stretch" ones are different versions of Debian GNU/Linux.
For Debian-based packages, you can then look up the package on https://packages.debian.org/ (type protoc into the "Search the contents of packages" form at the bottom of the page). That leads you to the protobuf-compiler package. Knowing that contains the protoc binary, you can install it in your Dockerfile with:
# Debian-based
FROM node:12
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive \
    apt-get install --no-install-recommends --assume-yes \
      protobuf-compiler
# The rest of your Dockerfile as above
COPY ...
RUN protoc ...
You generally must run apt-get update and apt-get install in the same RUN command, lest a subsequent rebuild reuse an old version of the package cache from the Docker build cache. I generally have only a single apt-get install command if I can manage it, with the package list alphabetized, one package per line, for maintainability.
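For instance, a sketch of that layout (curl and unzip are hypothetical extras here, included only to show the alphabetized one-package-per-line list):
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive \
    apt-get install --no-install-recommends --assume-yes \
      curl \
      protobuf-compiler \
      unzip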
If the image is Alpine-based, you can do a similar search on https://pkgs.alpinelinux.org/contents to find protoc, and similarly install it:
FROM node:12-alpine
RUN apk add --no-cache protoc
# The rest of your Dockerfile as above
Finally I solved my own issue.
The problem was the architecture: I was using linux-x86_32.zip, but it works using linux-x86_64.zip. (That also explains the confusing error: when a 32-bit binary's runtime loader isn't present in a 64-bit image, the shell reports "not found" even though the file itself exists.)
Even though @David Maze's answer is excellent and very complete, it didn't solve my problem, because apt-get installs version 3.0.0 and I wanted 3.14.0.
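A quick way to confirm this kind of mismatch, assuming the file utility is available in the image (the output shown is illustrative):
file ./proto/bin/protoc
# prints something like: ELF 32-bit LSB executable, Intel 80386, ...
# a 64-bit build would instead report: ELF 64-bit LSB executable, x86-64, ...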
So, the Dockerfile I have used to run protoc into a docker container is like this:
FROM node:12
...
# Download proto zip
ENV PROTOC_ZIP=protoc-3.14.0-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.14.0/${PROTOC_ZIP}
RUN unzip -o ${PROTOC_ZIP} -d ./proto
RUN chmod 755 -R ./proto/bin
ENV BASE=/usr
# Copy into path
RUN cp ./proto/bin/protoc ${BASE}/bin/
RUN cp -R ./proto/include/* ${BASE}/include/
# Download protoc-gen-grpc-web
ENV GRPC_WEB=protoc-gen-grpc-web-1.2.1-linux-x86_64
ENV GRPC_WEB_PATH=/usr/bin/protoc-gen-grpc-web
RUN curl -OL https://github.com/grpc/grpc-web/releases/download/1.2.1/${GRPC_WEB}
# Copy into path
RUN mv ${GRPC_WEB} ${GRPC_WEB_PATH}
RUN chmod +x ${GRPC_WEB_PATH}
RUN protoc -I=...
Because this is currently the highest-ranked result on Google and the instructions above won't work if you want to use docker/dind (e.g. for GitLab), this is how you can get the glibc dependency working for protoc there:
#!/bin/bash
# install gcompat, because protoc needs a real glibc or compatible layer
apk add gcompat
# install a recent protoc (use a version that fits your needs)
export PB_REL="https://github.com/protocolbuffers/protobuf/releases"
curl -LO $PB_REL/download/v3.20.0/protoc-3.20.0-linux-x86_64.zip
unzip protoc-3.20.0-linux-x86_64.zip -d $HOME/.local
export PATH="$PATH:$HOME/.local/bin"

"No module named PIL" after "RUN pip3 install Pillow" in docker container; neither PIL nor Pillow present in dist-packages directory

I'm following this SageMaker guide and using the 1.12 cpu docker file.
https://github.com/aws/sagemaker-tensorflow-serving-container
If I use the requirements.txt file to install Pillow, my container works great locally, but when I deploy to SageMaker, 'pip3 install' fails with an error indicating my container doesn't have internet access.
To work around that issue, I'm trying to install Pillow in my container before deploying to SageMaker.
When I include the lines "RUN pip3 install Pillow" and "RUN pip3 show Pillow" in my Dockerfile, the build output says "Successfully installed Pillow-6.2.0", and the show command indicates the lib was installed at /usr/local/lib/python3.5/dist-packages. Running "RUN ls /usr/local/lib/python3.5/dist-packages" in the Dockerfile also shows "PIL" and "Pillow-6.2.0.dist-info" in dist-packages, and the PIL directory includes many code files.
However, when I run my container locally, trying to import in python using "from PIL import Image" results in error "No module named PIL". I've tried variations like "import Image", but PIL doesn't seem to be installed in the context in which the code is running when I start the container.
Before the line "from PIL import Image", I added "import subprocess" and 'print(subprocess.check_output("ls /usr/local/lib/python3.5/dist-packages".split()))'.
This ls output matches what I get when running it in the Dockerfile, except that "PIL" and "Pillow-6.2.0.dist-info" are missing. Why are those two in /usr/local/lib/python3.5/dist-packages when the Dockerfile runs but not when my container is started locally?
Is there a better way to include Pillow in my container? The referenced Github page also shows that I can deploy libraries by including the files (in code/lib of model package), but to get files compatible with Ubuntu 16.04 (which the docker container uses; I'm on a Mac), I'd probably copy them from the docker container after running "RUN pip3 install Pillow" in my docker file, and it seems odd that I would need to get files from the docker container to deploy to the docker container.
My docker file looks like:
ARG TFS_VERSION
FROM tensorflow/serving:${TFS_VERSION} as tfs
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
COPY --from=tfs /usr/bin/tensorflow_model_server /usr/bin/tensorflow_model_server
# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean
RUN pip3 install Pillow
# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf tensorflow && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api
COPY ./ /
ARG TFS_SHORT_VERSION
ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"
RUN pip3 show Pillow
RUN ls /usr/local/lib/python3.5/dist-packages
I've tried installing Pillow on the same line as cython and the other dependencies, but the result is the same: those dependencies are in /usr/local/lib/python3.5/dist-packages both when the container is built and when it is started locally, while "PIL" and "Pillow-6.2.0.dist-info" are only present when the container is built.
Apologies for the late response.
If I use the requirements.txt file to install Pillow, my container works great locally, but when I deploy to SageMaker, 'pip3 install' fails with an error indicating my container doesn't have internet access.
If restricted internet access isn't a requirement, then you should be able to enable internet access by making enable_network_isolation=False when instantiating your Model class in the SageMaker Python SDK, as shown here: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/model.py#L85
If restricted internet access is a requirement, this means that you will need to either install your dependencies in your own container beforehand or make use of the packaging as you mentioned in your correspondence.
I have copied your provided Dockerfile and built an image from it in order to reproduce the error you are seeing. I was not able to reproduce the error quoted below:
However, when I run my container locally, trying to import in python using "from PIL import Image" results in error "No module named PIL". I've tried variations like "import Image", but PIL doesn't seem to be installed in the context in which the code is running when I start the container.
I created a similar Docker image and ran it as a container with the following command:
docker run -it --entrypoint bash <DOCKER_IMAGE>
From within the container I started a Python 3 session and ran the following commands locally without error:
root@13eab4c6e8ab:/# python3 -s
Python 3.5.2 (default, Oct 8 2019, 13:06:37)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
Can you please provide the code for how you're starting your SageMaker jobs?
Please double check that the Docker image you have created is the one being referenced when starting your SageMaker jobs.
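One quick way to verify, as a hypothetical one-liner (the image name is a placeholder):
docker run --rm <DOCKER_IMAGE> python3 -c "from PIL import Image; print(Image)"
# if Pillow is present this prints the PIL.Image module along with the path it was loaded from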
Please let me know if there is anything I can clarify.
Thanks!

Install Docker image from RHEL 7.3 .rpm file

I want to install Docker on RHEL 7.3 using .rpm files.
I have got access to the Docker .rpm files, but there is a list of files in the stable package.
Could anyone let me know which .rpm file I should use for the installation?
You can search for a ready-made RHEL Docker image on the Docker Hub portal and pull it with docker pull <image-name>.
Alternatively, you can build your own RHEL image as below.
Download the script mkimage-yum.sh from https://github.com/docker/docker/blob/master/contrib/mkimage-yum.sh
Modify mkimage-yum.sh to create a RHEL 7 minimal tar file: comment out the very last two or three lines (as below) and add a new line as follows:
#tar --numeric-owner -c -C "$target" . | docker import - $name:$version
#docker run -i -t $name:$version echo success
tar --numeric-owner -czf ${name}.tar.gz -C "$target" .
Run the script as ./mkimage-yum.sh rhel7_docker.
Build a docker image out of the tar file: cat rhel7_docker.tar.gz | sudo docker import - rhel7
The last argument, `rhel7`, is the name of the image that will be generated.
To install Docker from .rpm files, you need to download the rpm file(s) and install them using YUM. I did it for Docker Enterprise Edition: I downloaded the rpm files in the stable package from Docker storebits to a local folder. You need both the selinux and the docker-ee rpm files. Then point yum install at the downloaded files.
Note: the selinux rpm must be installed first, followed by the docker-ee rpm.
yum install "path to the rpm files"
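For example (file names and versions here are hypothetical; use the ones from your downloaded stable package):
sudo yum install ./docker-ee-selinux-17.06.2.ee.5-3.el7.rhel.noarch.rpm
sudo yum install ./docker-ee-17.06.2.ee.5-3.el7.rhel.x86_64.rpm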

Gitlab.com runners: How do I install and run software from external repos?

I'm pretty new to Gitlab.com's CI and to docker.
I have a simple python pelican static blog that builds with a simple .gitlab-ci.yml
image: python:2.7-alpine

pages:
  script:
    - pip install -r requirements.txt
    - pelican -s publishconf.py
  artifacts:
    paths:
      - public
So I see that it specifies a Python Docker image, uses pip to install various Python packages, then runs pelican, all within that image.
Now my issue is that I want to run my own version of pelican. I modified my requirements.txt file to look for my own branch of pelican, but this fails:
beautifulsoup4
markdown
smartypants
typogrify
git+https://github.com/jerryasher/pelican.git#hidden-cats
pelican-fontawesome
pelican-gist
pelican-jsfiddle
pelican-neighbors
Now when it builds, Gitlab's Runner tells me:
Running with gitlab-ci-multi-runner 1.9.0 (82714ae)
Using Docker executor with image python:2.7-alpine ...
Pulling docker image python:2.7-alpine ...
Running on runner-e11ae361-project-1654117-concurrent-0 via runner-e11ae361-machine-1484613050-ce975c76-digital-ocean-4gb...
Cloning repository...
Cloning into '/builds/jerrya/ashercodes'...
Checking out 532f8b38 as master...
$ pip install -r requirements.txt
Collecting git+https://github.com/jerryasher/pelican.git#hidden-cats (from -r requirements.txt (line 5))
Cloning https://github.com/jerryasher/pelican.git (to hidden-cats) to /tmp/pip-72xxqt-build
Error [Errno 2] No such file or directory while executing command git clone -q https://github.com/jerryasher/pelican.git /tmp/pip-72xxqt-build
Cannot find command 'git'
ERROR: Build failed: exit code 1
Okay, git doesn't seem to be present. Indeed, prior to the above attempt I had added a line to the .gitlab-ci.yml script to clone that repo locally with git, and that also failed, because ... no git.
(The Docker image I am using, python:2.7-alpine, also seems to have no apt-get.)
Do I need to build my own Docker image containing git, Python, and anything else I require, or is there some "usual" way to have a Gitlab.com runner pull in an external program from either a git repo or a typical Linux package repository?
And if I can't do this, is that the fault of the runner or the fault of the Docker image?
You can just install git (and any other package) if you need it. Your own image will be faster but it's not needed.
pages:
  script:
    - apk --update add git openssh
    - pip install -r requirements.txt
    ...
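Put together with the rest of the original file, the whole job would look something like this (everything except the apk line comes from the question's own .gitlab-ci.yml):
image: python:2.7-alpine

pages:
  script:
    - apk --update add git openssh
    - pip install -r requirements.txt
    - pelican -s publishconf.py
  artifacts:
    paths:
      - public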
