How to rebuild a Docker container in an air-gapped environment?

I have a FastAPI application that needs to be containerised. I created the Docker image on a system with internet connectivity and saved it as a tar archive. This archive was loaded with docker load onto a system that has Docker installed but no internet connectivity, and the image is working fine. But now I want to make changes to the application code and rebuild the image; only the app changes have to be pushed. How can this be achieved from this isolated system?

There are two actions during the build that need an internet connection.
The first one is pulling the base image for your Dockerfile.
So for example if your Dockerfile is something like:
FROM python:3.9
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
COPY ./app /code/app
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]
Then you would need the python:3.9 docker image on the system.
This is easily achievable by moving images using docker load as you described in the question.
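For example, on the connected machine you can export the base image and import it on the isolated one (the tar file name here is just a placeholder):
# on the machine with internet access
docker pull python:3.9
docker save python:3.9 -o python39.tar
# transfer python39.tar to the air-gapped system, then:
docker load -i python39.tar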
The second is pip installing packages (in the previous case the step RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt).
To run this install on a system with no internet connection, you would need to download the wheel (.whl) file for each requirement and install them using the --find-links /path/to/wheel/dir/ (and probably --no-index) flags.
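A rough sketch of that wheel-based approach (directory names are assumptions, and the wheels must be downloaded for the same Python version and platform as the target system):
# on the machine with internet access
pip download -r requirements.txt -d wheels/
# on the air-gapped machine, after copying the wheels/ directory over
pip install --no-index --find-links wheels/ -r requirements.txt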
This can become complicated but if your dependencies are more or less fixed you can do the following:
First, on the system that CAN connect to the internet, you build a base image with all your requirements:
FROM python:3.9
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
Then you can build this image and load it on the system with no internet. On that system you can create a new Dockerfile that starts from your newly created image and just adds your code:
FROM your-base-image
COPY ./app /code/app
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]
Then rebuilding this image should not need any internet.
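Putting the whole workflow together, it could look roughly like this (image, file and Dockerfile names are hypothetical; the FROM line of the second Dockerfile must match the base image tag):
# on the machine with internet access
docker build -f Dockerfile.base -t my-fastapi-base .
docker save my-fastapi-base -o my-fastapi-base.tar
# on the air-gapped machine
docker load -i my-fastapi-base.tar
docker build -f Dockerfile.app -t my-fastapi-app .
From then on, only the second build needs to be repeated when the application code changes.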

Related

Installing Rust for Dockerfile

I'm new to Docker, so apologies for this poor question.
I'm using an M1 Mac, and my Dockerfile looks like this:
FROM python:3.8.1-slim
ENV PYTHONUNBUFFERED 1
EXPOSE 8000
WORKDIR /app
COPY ./requirements.txt .
COPY ./src .
RUN pip install --verbose -r requirements.txt
CMD ["uvicorn", "--host", "0.0.0.0", "--port", "8000", "src.main:app"]
When I run docker build -t project . I get an error message that includes
Cargo, the Rust package manager, is not installed or is not on PATH.
I've tried adding cargo and rust to requirements.txt and playing with ENV PATH to no avail; which cargo on the host machine returns
/opt/homebrew/bin/cargo
Can someone please point me in the right direction?
Edit: I don't know why Rust is required here, but it seems like it isn't uncommon... this is where it shows up in the error:
Running command /usr/local/bin/python /usr/local/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpfzj95ykv
Checking for Rust toolchain....
Edit 2: I reduced the number of packages in requirements.txt and that seems to have fixed it for now. Still annoyed I can't tell from the error what the issue is, and curious what the fix would be...
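There is no definitive answer in this thread, but the error usually means that one of the pinned packages (cryptography is a common culprit) ships no pre-built wheel that the image's pip recognises for this platform, so pip falls back to building it from source, which requires the Rust toolchain. A minimal sketch of one direction to try, assuming the same Dockerfile, is to upgrade pip first so it can pick up newer pre-built wheels; installing Rust/Cargo inside the image (via apt or rustup) would be the heavier alternative:
FROM python:3.8.1-slim
ENV PYTHONUNBUFFERED 1
EXPOSE 8000
WORKDIR /app
COPY ./requirements.txt .
COPY ./src .
# An old pip may not recognise newer wheel tags (e.g. manylinux2014 / arm64),
# which forces a source build that needs Rust; upgrading pip often avoids that.
RUN pip install --upgrade pip setuptools wheel \
 && pip install --verbose -r requirements.txt
CMD ["uvicorn", "--host", "0.0.0.0", "--port", "8000", "src.main:app"]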

Flask and React App in single Docker Container

Good day SO,
I know this is bad practice and that I am supposed to have one App per container, but is there a way for me to have two services running concurrently in the same container, and how would I go about writing the Dockerfile for it?
My current Dockerfile for the Flask (Backend) App:
FROM python:3.6.9-slim-buster
WORKDIR /app/flask_backend
ENV PYTHONPATH "${PYTHONPATH}:/app"
COPY ./flask_backend ./
COPY requirements.txt .
RUN pip install -r requirements.txt
CMD python3 app/webapp/app.py
My React (Frontend) Dockerfile:
FROM node:12.18.0-alpine as build
WORKDIR /app/react_frontend
ENV PATH /app/node_modules/.bin:$PATH
ENV NODE_OPTIONS="--max-old-space-size=8192"
COPY ./react_frontend/package.json ./
COPY ./react_frontend/package-lock.json ./
RUN npm ci
RUN npm install react-scripts#3.4.1 -g
RUN npm install serve -g
COPY ./react_frontend ./
CMD ["serve", "-s", "build", "-l", "3000"]
My attempt to launch both apps within the same Docker Container was to merge the two Dockerfiles, but the resulting container does not have the data from the first Dockerfile, and I am unsure how to proceed.
My merged Dockerfile:
FROM python:3.6.9-slim-buster
WORKDIR /app/flask_backend
ENV PYTHONPATH "${PYTHONPATH}:/app"
COPY ./flask_backend ./
COPY requirements.txt .
RUN pip install -r requirements.txt
CMD python3 app/webapp/app.py
FROM node:12.18.0-alpine as build
WORKDIR /app/react_frontend
ENV PATH /app/node_modules/.bin:$PATH
ENV NODE_OPTIONS="--max-old-space-size=8192"
COPY ./react_frontend/package.json ./
COPY ./react_frontend/package-lock.json ./
RUN npm ci
RUN npm install react-scripts#3.4.1 -g
RUN npm install serve -g
COPY ./react_frontend ./
CMD ["serve", "-s", "build", "-l", "3000"]
I am a beginner in using Docker, and hence I foresee that there will be several problems with this method, such as communication between the two apps (the backend uses port 5000). Any guidance will be greatly appreciated!
A React application doesn't usually have a server per se (development-only Docker setups aside). Instead, you run a tool like Webpack to compile it down to static files, which you can then serve to the browser, which then runs them.
On your host system you'd run something like
yarn build
which produces a dist directory; then you'd copy this into your Flask static directory.
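For instance, something along these lines (the directory names are assumptions and depend on your build tooling and project layout):
# build the production bundle of the React app
cd react_frontend
yarn build
# copy the compiled assets into the Flask app's static directory
mkdir -p ../flask_backend/static
cp -r dist/. ../flask_backend/static/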
If you do this entirely ahead-of-time, then you can run your application out of a Python virtual environment, which will be a much easier development and test setup, and the Dockerfile you show won't change.
If you want to build this entirely in Docker (for example to take advantage of a more Docker-native automated build system) a multi-stage build matches well here. You can use a first stage to build the front-end application, and then COPY that into the final application in the second stage. That looks roughly like:
FROM node:12.18.0-alpine as build
WORKDIR /app/react_frontend
COPY ./react_frontend/package.json ./
COPY ./react_frontend/package-lock.json ./
RUN npm ci
COPY ./react_frontend ./
RUN npm run build
FROM python:3.6.9-slim-buster
WORKDIR /app/flask_backend
ENV PYTHONPATH "${PYTHONPATH}:/app"
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY ./flask_backend ./
COPY --from=build /app/react_frontend/dist/ ./static/
CMD python3 app/webapp/app.py
This approach is not compatible with setups that overwrite Docker image contents using bind mounts. A non-Docker host Node and Python setup will be a much easier development environment, and for this particular setup isn't likely to be substantially different from the Docker setup.

How does COPY in separate lines help with fewer cache invalidations?

Docker documentation suggests the following as best practice.
If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once. This ensures that each step’s build cache is only invalidated (forcing the step to be re-run) if the specifically required files change.
For example:
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/
Results in fewer cache invalidations for the RUN step than if you put the COPY . /tmp/ before it.
My question is: how does it help?
In either case, if the requirements.txt file doesn't change, then pip install would fetch the same result, so why does it matter that in the best-practice scenario requirements.txt is the only file in the directory while pip install runs?
On the other hand, it creates one more layer in the image, which is something I would not want.
Say you have a very simple application
$ ls
Dockerfile main.py requirements.txt
With the corresponding Dockerfile
FROM python:3
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["./main.py"]
Now say you only change the main.py script. Since the requirements.txt file hasn't changed, the RUN pip install ... can reuse the Docker image cache. This avoids re-running pip install, which can download a lot of packages and take a while.
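For contrast, here is a sketch of the ordering the documentation advises against (not from the question): with a single COPY before the install step, editing main.py invalidates the copy layer and therefore forces the pip install layer to be rebuilt as well.
FROM python:3
WORKDIR /app
# any change under ., e.g. to main.py, invalidates this layer...
COPY . .
# ...so this expensive step loses its cache and re-runs on every code change
RUN pip install -r requirements.txt
CMD ["./main.py"]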

Docker: is it possible not to build from scratch w/o using cache?

I had a simple Docker file:
FROM python:3.6
COPY . /app
WORKDIR /app
RUN pip install -r requirements
The problem was that it installs the requirements on every build. I have a lot of requirements, but they rarely change.
I searched for solutions and ended up with this:
FROM python:3.6
COPY requirements.txt /app/requirements.txt
WORKDIR /app
RUN pip install -r requirements.txt
COPY . /app
That worked perfectly fine, until the moment it stopped updating the code. E.g., I comment out a couple of lines in some file that goes into /app and build, and the lines stay uncommented in the image.
I searched again and found out that this is possibly caused by the cache. I tried the --no-cache build flag, but now I'm getting the requirements installation again.
Is there some workaround or right way to do it in my situation?
You should use ADD, not COPY, if you want to invalidate the cache.
FROM python:3.6
COPY requirements.txt /app/requirements.txt
WORKDIR /app
RUN pip install -r requirements.txt
ADD . /app
Try the above Dockerfile.
Have you ever used docker-compose? docker-compose has volumes, which work as a kind of cache: when you start the container it will not re-build your dependencies, and it picks up your code changes automatically.
In your situation, you could do something like this:
FROM python:3.6
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . /app
CMD ["python", "app.py"]
Give it a try.
Changing a file that you simply copy in (COPY . /app) will not be seen by Docker, so it will use a cached layer *, hence your result. Using --no-cache will force a re-build of every layer, again explaining what you've observed.
The 'docker' way to avoid re-installing all requirements every time would be to put all the static requirements in a base image, then use this image in your FROM line with all the other requirements which do change.
* Although I'm fairly sure I've observed that if you copy a named file, as opposed to a directory, changes are picked up even without --no-cache.
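A rough sketch of the base-image approach described above (file and image names are made up for illustration): bake the rarely-changing requirements into their own image, then build the application image on top of it.
# Dockerfile.base -- rebuilt only when requirements.txt changes
FROM python:3.6
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt

# Dockerfile -- rebuilt on every code change; no pip install involved
FROM my-app-base
COPY . /app
Build the base with docker build -f Dockerfile.base -t my-app-base . whenever the requirements change, and run a plain docker build -t my-app . for everyday code changes.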

how to use pip to install pkg from requirement file without reinstall

I am trying to build a Docker image. My Dockerfile is like this:
FROM python:2.7
ADD . /code
WORKDIR /code
RUN pip install -r requirement.txt
CMD ["python", "manage.py", "runserver", "0.0.0.0:8300"]
And my requirement.txt file like this:
wheel==0.29.0
numpy==1.11.3
django==1.10.5
django-cors-headers==2.0.2
gspread==0.6.2
oauth2client==4.0.0
Now I have a little change in my code, and I need pandas, so I added it to the requirement.txt file:
wheel==0.29.0
numpy==1.11.3
pandas==0.19.2
django==1.10.5
django-cors-headers==2.0.2
gspread==0.6.2
oauth2client==4.0.0
pip install -r requirement.txt will install all the packages in that file, although almost all of them have been installed before. My question is: how can I make pip install only pandas? That would save time when building the image.
Thank you
If you rebuild your image after changing requirement.txt with docker build -t <your_image> ., I guess it can't be done, because each time Docker runs docker build it starts an intermediate container from the base image; that is a fresh environment, so pip will obviously need to install all of the dependencies.
You could consider building your own base image on top of python:2.7 with the common dependencies pre-installed, and then building your application image on top of that base image. Whenever you need to add more dependencies, manually rebuild the base image on top of the previous one with only the extra dependencies installed, and then, optionally, docker push it back to your registry.
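As a sketch of that idea (image names and the registry path are hypothetical): when pandas becomes necessary, layer it on top of the previous base image instead of reinstalling everything.
# Dockerfile.base-v2 -- adds only the new dependency on top of the old base
FROM my-registry/myapp-base:1
RUN pip install pandas==0.19.2
Then rebuild and (optionally) push the new base:
docker build -f Dockerfile.base-v2 -t my-registry/myapp-base:2 .
docker push my-registry/myapp-base:2
The application Dockerfile then starts with FROM my-registry/myapp-base:2, keeps the ADD . /code and CMD lines, and rebuilding it after a code change installs nothing.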
Hope this could be helpful :-)
