Is it possible create a base image from a file? - docker

repo with a few services and in each service, I have the following base code:
FROM python:3.8.13-slim-bullseye
WORKDIR /usr/app
RUN apt-get update
RUN apt-get install default-libmysqlclient-dev build-essential -y
RUN python -m pip install --upgrade pip
RUN pip install pipenv setuptools
This is a little slow to rebuild each time, and sometimes I need to drop all images, so the idea is to know if is possible, to create Dockerfile as a base image and import this from another docker file in order to build these steps only one time locally.
Thanks

Related

Pypy datascience docker image

I have some datascience projects running in docker containers (I use k8s). I am trying to speed up my code by using pypy as my interpreter, but this has been a nightmare.
My OS is ubuntu 20.04
The main libraries I need are:
SQLAlchemy
SciPy
gRPC
For grpc I'm using grpclib, and for SciPy I'm installing it using the miniconda docker image.
My final hurdle is installing psycopg2cffi to make SQLAlchemy work, but after a couple of all-nighters I still haven't managed to make this work. I can install it, but when I run I get a SCRAM authentication problem that I've seen others also get.
Is there a pypy docker file someone has already created that has datascience libraries in it? Doesn't seem like it would be something no one has tried to be before..
Here's by dockerfile so far:
FROM conda/miniconda3 as base
# Setp conda env with pypy3 as the interpreter
RUN conda create -c conda-forge -n pypy-env pypy python=3.8 -y
ENV PATH="/usr/local/envs/pypy-env/bin:$PATH"
RUN pypy -m ensurepip
RUN apt-get -y update && \
apt-get -y install build-essential g++ python3-dev libpq-dev
# Install big/annoying libraries first
RUN pip install psycopg2cffi -y
RUN conda install scipy -y
RUN pip install numpy
WORKDIR /home
COPY ./core/requirements/requirements.txt .
COPY ./core/requirements/basic_requirements.txt .
RUN pip install -r ./requirements.txt
FROM python:3.8-slim as final
WORKDIR /home
COPY --from=base /usr/lib/x86_64-linux-gnu/libpq* /usr/lib/x86_64-linux-gnu/
COPY --from=base /usr/local/envs/pypy-env /usr/local/envs/pypy-env
ENV PATH="/usr/local/envs/pypy-env/bin:$PATH"
COPY .env .env
COPY .src/ .

How to add copy files to Docker image

I'm running a python script to manipulate pictures. When I run the test the system obviously does not find the images. I new to docker and don't really understand how to do that.
This is how the structure looks
And this is the dockerfile
FROM ubuntu:latest
RUN apt update
RUN apt install python3 -y
RUN apt-get -y install python3-pip
RUN pip install pillow
RUN pip install wand
RUN DEBIAN_FRONTEND="noninteractive" apt-get install libmagickwand-dev --no-install-recommends -y
WORKDIR /usr/app/src
COPY image_converter.py ./
COPY test_image_converter.py ./
RUN python3 -m unittest test_image_converter.py
COPY image_converter.py ./
COPY test_image_converter.py ./
here you've COPIED ****.py file to ./(== WORKDIR)
so, in the same way copy your image files to your WORKDIR
such as
COPY test_images ./
This should work
IDK if this will help, but try to look its content.
https://www.geeksforgeeks.org/copying-files-to-and-from-docker-containers/
First, you didn't copy the image folder, should have these commands:
WORKDIR /usr/app/src
COPY test_images ./
Second, you should combine same type of Docker commands (to reduce layers, hence reduce docker image's size)
example : instead of saying
COPY image_converter.py ./
COPY test_image_converter.py ./
You can write
COPY *.py ./
Another example :
RUN apt update
RUN apt install python3 -y
You can write
RUN apt update; apt install python3 -y

Docker image Package Patch within Dockerfile

I have below docker image, where I need to update patch to curl package, in below Docker image, in Line number 3 I am already doing update, but it is shown up in Vulnerabilities report.
I have added, RUN yum -y update curl at the end of Dockerfile, then it is not showing up in Vulnerabilities report.
Any fix?, All Packages must install with latest version, I dont want to be mention explicitly
or any mistakes in Dockerfile?
FROM centos:7 AS base
FROM base AS build
# Install all dependenticies
RUN yum -y update \
&& yum install -y openssl-devel bzip2-devel libffi-devel \
zlib-devel wget gcc make
# Below compile python from source
FROM base
ENV LD_LIBRARY_PATH=/usr/local/lib64:/usr/local/lib
COPY --from=build /usr/local/ /usr/local/
# Copy Code
COPY . /app/
WORKDIR /app
#Install code dependecies.
RUN /usr/local/bin/python -m pip install --upgrade pip \
&& pip install -r requirements.txt
# Why, I need this step, when I already update RUN in line 3?, If I won't perform I see in compliance report, any fix?
RUN yum -y update curl
# run Application
ENTRYPOINT ["python"]
CMD ["test.py"]
In order to understand what constitutes an image, you need to look at a Dockerfile in a different way:
Every step (with the exception of FROM) creates a new image, with the results of the previous step as a base.
FROM doesn't use the previous step, but an explicitly specified one.
Now, looking at your Dockerfile, you seem to wonder why RUN yum -y update curl doesn't work as expected. For easier understanding, let's trace it backwards:
RUN yum -y update curl
RUN /usr/local/bin/python -m pip install --upgrade pip \ && pip install -r requirements.txt
WORKDIR /app
COPY . /app/
COPY --from=build /usr/local/ /usr/local/
ENV LD_LIBRARY_PATH=/usr/local/lib64:/usr/local/lib
FROM base -- at this point, the previous step is changed to the last step of base
FROM centos:7 AS base -- here, the previous step is changed to centos:7
As you see, nowhere in the earlier steps is yum update -y curl!
BTW: Typing this, I'm wondering what your precise question is, i.e. whether this works or doesn't or whether you wonder why it's necessary. Are you aware of the difference between yum update and yum update curl even?
docker build and friends have a cache system, based on the text of the input. So if the text of the command yum -y update doesn't change, it will continue using the same cached version of the output forever (or until the cache is deleted). Try running the build with --no-cache and see if that helps.

Packages wont install in docker

I am trying to install tesseract-ocr to docker from a Dockerfile. When I build the Dockerfile everything looks normal and i get no errors but when I run the container tesseract is not installed.
If I access the container using sudo docker exec -t -i <container_id> /bin/bash and manually install tesseract using apt-get install -y tesseract-ocr-all it installs and works perfectly. Why doesn't it work when I try to install it during the build process?
My Dockerfile looks like this:
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get install -y tesseract-ocr-all
RUN tesseract --version
FROM python:3.7
WORKDIR ocr
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
Thanks!
It looks like you are leveraging Docker multi-stage builds without realizing it.
When you put FROM python:3.7, you essentially throw away everything you have done above that, since you start a new stage.
The easiest solution I can see is to move
RUN apt-get update \
&& apt-get install -y tesseract-ocr-all
RUN tesseract --version
into the FROM python:3.7 stage, and remove the FROM ubuntu:20.04 stage.
You need to switch your user, since you likely don't have permission to run those commands. Something like this should work:
USER root
RUN apt-get update \
&& apt-get install -y tesseract-ocr-all
USER <switch back to previous user>
You'll need to figure out what the default user is to switch back, which you can probably find in the Ubuntu docs or using whoami.

Unable to locate package python

I'm trying to run the below Dockerfile contents on ubuntu image.
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python
RUN apt-get install -y python-pip
RUN pip install flask
COPY app.py /opt/app.py
ENTRYPOINT FLASK_APP=/opt/app.py flask run --host=0.0.0.0
But i'm getting the below error at layer 3.
Step 1/7 : FROM ubuntu
---> 549b9b86cb8d
Step 2/7 : RUN apt-get update
---> Using cache
---> 78d87d6d9188
Step 3/7 : RUN apt-get install -y python
---> Running in a256128fde51
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package python
Athough while I run the below command individually
sudo apt-get install -y python
It's successfully getting installed.
Can anyone please help me out.
Note: I'm working behind organization proxy.
Step 2/7 : RUN apt-get update
---> Using cache
You should run apt-get update and apt-get install in the same RUN instruction as follows
RUN apt-get update && apt-get install -y python
Each instruction in a Dockerfile will create separate layer in the image and the layers are cached. So the apt-get update might just use cache and not even run. This has happened in your case as well. You can see the line ---> Using cache in your logs. You can use docker build --no-cache to make docker rebuild all the layers without using cache.
You can instead just use python:3 official image as base image to run your python apps.
In the python installation tutorial there is a package name python3.x for Debian.
I think this is your case. I tested it in the Docker and this is right configuration
looks like
From ubuntu:20.04
RUN apt-get update && apt-get -y install python3.8 python3.8-dev
I feel you should rather use Python3 's image instead of using ubuntu and then installing it.
FROM python:3
RUN apt-get update && apt-get install -y python3-pip #You don't need to install pip, because it is already there in python's image
RUN pip install -r requirements.txt
COPY . /usr/src/apps #This you can change
WORKDIR /usr/src/apps/ #this as well
CMD ["python","app.py"]
Try the following, it worked for me
apt-get install python3-pip
And then for installing flask you will have to use
pip3 install flask

Resources