docker ERROR: "/${DIR_NAME}" not found: not found - How to avoid huge images

I'm running into a problem with Docker where I'm using some variables in the Dockerfile. The variables work everywhere except in the source= argument of a bind mount.
I'm installing a large (40+ GB) software package. I have three different versions to work with, so I made a variable to identify the version directory. I'm using a bind mount to avoid copying the installation source into the image.
What is the best way to debug? --progress=plain and --no-cache don't really add much to the error message.
Is there a better way to do this whole thing? I don't want to create 150 GB images if I can avoid it. If I kept all three versions of the software in one directory, that could mean a long build and a big image, since I need to select the version at build time. I guess I may need a more involved build process than a single Dockerfile allows. What would that look like?
How do I get around this variable issue?
The command that causes the problem is here:
ARG SOFTWARE_DIR=MySoftwareDirectory #this lives in the same directory as the *.docker
ARG SOFTWARE_VER=2022.3
FROM centos:latest AS base
RUN yum update -y \
&& yum upgrade -y \
&& yum install -y libXext libXrender libXtst \
&& yum clean all
RUN mkdir /tools
FROM base AS final
ARG SOFTWARE_VER
ARG SOFTWARE_DIR
COPY ./install_config_${SOFTWARE_VER}.txt /tools/install_config.txt
RUN --mount=type=bind,target=/mnt/${SOFTWARE_DIR},readonly,source=${SOFTWARE_DIR} \
/mnt/${SOFTWARE_DIR}/setup --batch
ENTRYPOINT ["/tools/software/${SOFTWARE_VER}/bin/software"]
The error is:
ERROR: "/${SOFTWARE_DIR}" not found: not found
...
failed to compute cache key: "/${SOFTWARE_DIR}" not found: not found
This works:
ARG SOFTWARE_DIR=MySoftwareDirectory #this lives in the same directory as the *.docker
ARG SOFTWARE_VER=2022.3
FROM centos:latest AS base
RUN yum update -y \
&& yum upgrade -y \
&& yum install -y libXext libXrender libXtst \
&& yum clean all
RUN mkdir /tools
FROM base AS final
ARG SOFTWARE_VER
ARG SOFTWARE_DIR
COPY ./install_config_${SOFTWARE_VER}.txt /tools/install_config.txt
RUN --mount=type=bind,target=/mnt/${SOFTWARE_DIR},readonly,source=MySoftwareDirectory \
/mnt/${SOFTWARE_DIR}/setup --batch
ENTRYPOINT ["/tools/software/${SOFTWARE_VER}/bin/software"]
I am building with:
DOCKER_BUILDKIT=1 docker build -t software -f software.docker .
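For debugging, the same build with the flags mentioned earlier looks like this:
DOCKER_BUILDKIT=1 docker build --progress=plain --no-cache -t software -f software.docker .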
My directory structure looks like this:
software.docker
MySoftwareDirectory
I did chmod 777 on MySoftwareDirectory to correct a permissions issue. (I know that's not the best way.)
Docker version is 20.10.21. Host is CentOS 7 fully patched.

Related

Issue in creating Docker image from Dockerfile

I created a Dockerfile in order to install a Tomcat server, with Ubuntu as the base OS.
My Dockerfile:
FROM ubuntu
RUN apt-get update && apt-get upgrade -y #to update os
RUN apt-get dist-upgrade
RUN apt-get install build-essential
RUN apt-get install openjdk-8-jdk # to install java 8
RUN apt-get wget -y #to install wget package
RUN apt-get wget https://mirrors.estointernet.in/apache/tomcat/tomcat-9/v9.0.37/bin/apache-tomcat-9.0.37.tar.gz #to download tomcat
RUN tar -xvzf apache-tomcat-9.0.37 # unzipping the tomcat
RUN mkdir tomcat # creating tomcat directory
RUN cp apache-tomcat-9.0.37/* tomcat # copying tomcat files to tomcat directory
Command to create Docker Image from Docker file:
docker build -t [img name] -f [file name] .
On execution, while the Java package is being installed, the build stops with this prompt:
After this operation, 242 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
You are getting the prompt because the command is awaiting user input on whether or not to install a package. The -y flag you're already using on a few commands (like the wget install) tells apt-get to assume a yes. Add this flag to all your installation commands.
By the way, there are quite a few potential issues with the Dockerfile you posted.
For example, you have RUN apt-get wget ...
Are you sure that is what you want, and not just RUN wget ...? wget is not a subcommand that apt-get accepts, so this will cause unexpected behavior.
You also seem to be missing the command to start the Tomcat server, so nothing will happen when you run the image.
I think you should add DEBIAN_FRONTEND=noninteractive when running the apt-get commands, something like this:
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install build-essential -y
Also, it's considered bad practice to use multiple RUN steps that could be consolidated into one. More about Dockerfile best practices can be found in the official documentation.
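Putting all of that together, a corrected Dockerfile might look roughly like this sketch (the Tomcat URL is the one from your original file; the /opt/tomcat layout and the catalina.sh entrypoint are conventional choices, not something from your setup):
FROM ubuntu
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
        openjdk-8-jdk \
        wget && \
    apt-get clean
# Download, unpack, and relocate Tomcat in one layer
RUN wget https://mirrors.estointernet.in/apache/tomcat/tomcat-9/v9.0.37/bin/apache-tomcat-9.0.37.tar.gz && \
    tar -xvzf apache-tomcat-9.0.37.tar.gz && \
    mv apache-tomcat-9.0.37 /opt/tomcat && \
    rm apache-tomcat-9.0.37.tar.gz
# Run Tomcat in the foreground so the container stays up
CMD ["/opt/tomcat/bin/catalina.sh", "run"]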

Docker multistage build vs. keeping artifacts in git

My target container is a build environment container, so my team would build an app in a uniform environment.
This app doesn't necessarily run as a container - it runs on physical machine. The container is solely for building.
The app depends on third parties.
Some I can apt-get install with a Dockerfile RUN command.
And some I must build myself because they require special building.
I was wondering which way is better.
Using a multi-stage build seems cool; for example, this Dockerfile:
FROM ubuntu:18.04 AS third_party
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
ADD http://.../boost.tar.gz /
RUN tar xf boost.tar.gz && \
... && \
make --prefix /boost_out ...
FROM ubuntu:18.04 AS final
COPY --from=third_party /boost_out/ /usr/
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
CMD ["bash"]
...
Pros:
Automatically built when I build my final container
Easy to change third party version (boost in this example)
Cons:
The ADD command downloads the ~100 MB file each time, which makes the image build slower
I want to use --cache-from so that I can cache third_party and build from a different Docker host. That means storing a ~1.6 GB image in a Docker registry, which is pretty heavy to pull/push (see the sketch below).
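For reference, a --cache-from workflow for the third_party stage might look roughly like this (the registry name is illustrative):
# On the machine that built the stage: push it once
docker build --target third_party -t registry.example.com/third_party:latest .
docker push registry.example.com/third_party:latest
# On another host: seed the build cache before building
# (with BuildKit you would also need --build-arg BUILDKIT_INLINE_CACHE=1 on the original build)
docker pull registry.example.com/third_party:latest
docker build --cache-from registry.example.com/third_party:latest -t myapp .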
On the other hand
I could just build boost (with this third_party image) and store its artifacts in some storage, git for example. That would take ~200 MB, which is better than storing a 1.6 GB image.
Pros:
Smaller disk footprint
Cons:
Cumbersome build
Manually building and pushing artifacts to git when changing the boost version.
Somehow linking the Docker build to git so it pulls the newest artifacts and COPYs them into the final image.
Either way I need a third_party image that uniformly and automatically builds the third parties. In option 1 that image is bigger than in option 2, since option 2's image would contain just the build tools and not the build artifacts.
Is this the trade-off?
1. is more automatic but consumes more disk space and push/pull time,
2. is cumbersome but consumes less disk space and push/pull time?
Are there any other virtues for any of these ways?
I'd like to propose changing your first attempt to something like this:
FROM ubuntu:18.04 as third_party
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
RUN wget http://.../boost.tar.gz -O /boost.tar.gz && \
tar xvf boost.tar.gz && \
... && \
make --prefix /boost_out ... && \
find -name \*.o -delete && \
rm /boost.tar.gz # this is important!
FROM ubuntu:18.04 AS final
COPY --from=third_party /boost_out/ /usr/
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
CMD ["bash"]
This way, you are paying for the download of boost only once (when building the image without a cache), and you do not pay for the storage/pull-time of the original tar-ed sources. Additionally, you should remove unneeded target files (.o?) from the build in the same step in which they are generated. Otherwise, they are stored and pulled as well.
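One way to confirm the cleanup is working is to check how much each layer contributes to the image size (the image name here is illustrative):
docker history myapp:latest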
If you are at liberty to post the whole Dockerfile, I'll gladly take a deeper look at it and give you some hints.

Up to date (lts-13.0) minimal base docker image for haskell/stack?

I would like to deploy my Haskell application on Docker, and the base image fpco/stack-build that I've found takes 9 GB! Do you know of a more minimal base image?
stack-build is as large as it is, because it contains the required system dependencies of all packages on Stackage.
I am using the following base image for building and deploying:
FROM ubuntu:18.04
RUN apt-get update
# Build dependencies
RUN apt-get install --assume-yes curl
RUN curl -sSL https://get.haskellstack.org/ | sh
RUN apt-get install --assume-yes libtinfo-dev
# Without this haddock crashes for modules containing
# non-ASCII characters.
ENV LANG C.UTF-8
It's not really minimal if you just want to use the image at runtime, since you wouldn't need stack in that case.
That image might only be necessary to build the executable; once you have it built, you could use a multi-stage Docker build, or just copy the executable directly into a slimmer image, as in the sketch below.
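A minimal multi-stage sketch along those lines (the executable name my-app, the source layout, and the libgmp10 runtime dependency are illustrative assumptions):
FROM fpco/stack-build:lts-13.0 AS build
COPY . /src
WORKDIR /src
# --copy-bins drops the compiled executables into --local-bin-path
RUN stack build --copy-bins --local-bin-path /out

FROM ubuntu:18.04
# GHC-compiled binaries usually need libgmp at runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends libgmp10 && \
    rm -rf /var/lib/apt/lists/*
COPY --from=build /out/my-app /usr/local/bin/my-app
CMD ["my-app"]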
The dockerfile is available here: https://github.com/commercialhaskell/stack/blob/master/etc/dockerfiles/stack-build/lts-13.0/Dockerfile
You could remove these commands (which probably add up to the bulk of the size):
# Use Stackage's debian-bootstrap.sh script to install system libraries and
# tools required to build any Stackage package.
#
RUN apt-get update && \
apt-get install -y wget && \
wget -qO- https://raw.githubusercontent.com/fpco/stackage/$BOOTSTRAP_COMMIT/debian-bootstrap.sh | bash && \
rm -rf /var/lib/apt/lists/*

Dockerfile - Hide --build-args from showing up in the build time

I have the following Dockerfile:
FROM ubuntu:16.04
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y \
git \
make \
python-pip \
python2.7 \
python2.7-dev \
ssh \
&& apt-get autoremove \
&& apt-get clean
ARG password
ARG username
ENV password $password
ENV username $username
RUN pip install git+http://$username:$password@org.bitbucket.com/scm/do/repo.git
I use the following commands to build the image from this Dockerfile:
docker build -t myimage:v1 --build-arg password="somepassword" --build-arg username="someuser" .
However, in the build log the username and password that I pass as --build-arg are visible.
Step 8/8 : RUN pip install git+http://$username:$password@org.bitbucket.com/scm/do/repo.git
---> Running in 650d9423b549
Collecting git+http://someuser:somepassword@org.bitbucket.com/scm/do/repo.git
How to hide them? Or is there a different way of passing the credentials in the Dockerfile?
Update
You know, I was focusing on the wrong part of your question. You shouldn't be using a username and password at all. You should be using access keys, which permit read-only access to private repositories.
Once you've created an ssh key and added the public component to your repository, you can then drop the private key into your image:
RUN mkdir -m 700 -p /root/.ssh
COPY my_access_key /root/.ssh/id_rsa
RUN chmod 700 /root/.ssh/id_rsa
And now you can use that key when installing your Python project:
RUN pip install git+ssh://git@bitbucket.org/you/yourproject.repo
(Original answer follows)
You would generally not bake credentials into an image like this. In addition to the problem you've already discovered, it makes your image less useful because you would need to rebuild it every time your credentials changed, or if more than one person wanted to be able to use it.
Credentials are more generally provided at runtime via one of various mechanisms:
Environment variables: you can place your credentials in a file, e.g.:
USERNAME=myname
PASSWORD=secret
And then include that on the docker run command line:
docker run --env-file myenvfile.env ...
The USERNAME and PASSWORD environment variables will be available to processes in your container.
Bind mounts: you can place your credentials in a file, and then expose that file inside your container as a bind mount using the -v option to docker run:
docker run -v /path/to/myfile:/path/inside/container ...
This would expose the file as /path/inside/container inside your container.
Docker secrets: If you're running Docker in swarm mode, you can expose your credentials as docker secrets.
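For example, with swarm mode active, creating a secret and attaching it to a service might look like this (the names are illustrative); the secret then appears inside the container as the file /run/secrets/repo_password:
printf 'somepassword' | docker secret create repo_password -
docker service create --name myapp --secret repo_password myimage:v1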
It's worse than that: they're in docker history in perpetuity.
I've done two things here in the past that work:
You can configure pip to use local packages, or to download dependencies ahead of time into "wheel" files. Outside of Docker you can download the package from the private repository, giving the credentials there, and then you can COPY in the resulting .whl file.
# On the host, outside of Docker: build the wheel using the credentials
pip install wheel
pip wheel --wheel-dir ./wheels git+http://$username:$password@org.bitbucket.com/scm/do/repo.git
docker build .
# In the Dockerfile: copy in and install the pre-built wheels
COPY ./wheels/ ./wheels/
RUN pip install wheels/*.whl
The second is to use a multi-stage Dockerfile where the first stage does all of the installation, and the second doesn't need the credentials. This might look something like
FROM ubuntu:16.04 AS build
RUN apt-get update && ...
...
RUN pip install git+http://$username:$password@org.bitbucket.com/scm/do/repo.git
FROM ubuntu:16.04
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y \
python2.7
COPY --from=build /usr/lib/python2.7/site-packages/ /usr/lib/python2.7/site-packages/
COPY ...
CMD ["./app.py"]
It's worth double-checking in the second case that nothing has gotten leaked into your final image, because the ARG values are still available to the second stage.
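One way to make that double-check concrete is to grep the final image's layer metadata for the secret values (image name from the build above):
# no output means the password does not appear in any layer's metadata
docker history --no-trunc myimage:v1 | grep somepassword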
For me, I created a bash file called set-up-cred.sh.
Inside set-up-cred.sh:
echo $CRED > cred.txt;
Then, in the Dockerfile:
RUN bash set-up-cred.sh;
...
RUN rm cred.txt;
This keeps the credential variables from being echoed in the build output.

docker-compose update from S3 bucket

Our Dockerfile invokes a python script which copies a binary from S3 to /usr/bin. This works fine the first time. But from then on "docker-compose build" does nothing because everything is cached. This is a problem if the binary has changed.
Short of building with --no-cache, what is the best way to make sure "docker-compose build" will always pick up the new binary if there is one? We don't mind if it unnecessarily downloads the binary even when it's unchanged, so long as it does pick it up when the binary has changed.
Seems like we want a Dockerfile step that always executes?
FROM ubuntu:trusty
RUN apt-get update
RUN apt-get -y install software-properties-common
RUN apt-get -y install --reinstall ca-certificates
RUN add-apt-repository ppa:fkrull/deadsnakes
RUN apt-get update && apt-get install -y \
curl \
wget \
vim \
git \
python3.5 \
python3-pip \
python3-setuptools \
libpcap0.8-dev
RUN ln -sf /usr/bin/python3.5 /usr/bin/python3
ADD . /app
WORKDIR /app
# Install Python Requirements
RUN pip3 install -r etc/python/requirements.txt
# Download/Install processor and associated libs
RUN python3 setup_processor.py
RUN mkdir -p /logs
ENTRYPOINT ["/app/entrypoint.sh"]
Where setup_processor.py downloads directly from S3 to /usr/bin.
As of now there is no direct feature for this, but there is a workaround.
Add a build argument just before your download step:
ARG BUILD_ON=now
# Download/Install processor and associated libs
RUN python3 setup_processor.py
While building the image, pass a value that changes every time (note the quotes around $(date), since its output contains spaces):
docker build --build-arg BUILD_ON="$(date)" ....
This makes sure the ARG value changes on every build, so that step and every cached step after it are invalidated.
A feature for this has been requested and is being tracked in the thread below:
https://github.com/moby/moby/issues/1996
