Docker run giving 'No module named 'pandas''

This is my Dockerfile:
FROM public.ecr.aws/i7d4o1h8/miniconda3:4.10.3p0
RUN pip install --upgrade pip
COPY condaEnv.yml .
RUN conda env create -f condaEnv.yml python=3.9.7
RUN pip install sagemaker-inference
COPY inference_code.py /opt/ml/code/
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code/
ENV SAGEMAKER_PROGRAM inference_code.py
ENTRYPOINT ["python", "/opt/ml/code/inference_code.py"]
When I run docker build with the command docker build -t docker_name ., it succeeds, and at the end I see Successfully tagged docker_name:latest.
But when I try to run the Docker image, it gives:
Traceback (most recent call last):
File "/opt/ml/code/inference_code.py", line 4, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
But in the condaEnv.yml file I have pandas defined as:
name: plato_vrar
channels:
- conda-forge
- defaults
dependencies:
- pandas=1.3.4
- pip=21.2.4
prefix: plato_vrar/
What am I missing here?

In Anaconda, creating an environment only prepares it; you also need to activate it with conda activate <ENV_NAME>, so that python points at the environment's interpreter rather than the system one. Refer to the conda documentation for more details: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#activating-an-environment
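For a non-interactive build you can get the same effect with conda run, which resolves python from the named environment instead of the system one. A minimal sketch of the corrected Dockerfile, assuming the environment name plato_vrar from the condaEnv.yml above:
FROM public.ecr.aws/i7d4o1h8/miniconda3:4.10.3p0
COPY condaEnv.yml .
RUN conda env create -f condaEnv.yml
# Install pip packages into the environment, not into the base interpreter
RUN conda run -n plato_vrar pip install sagemaker-inference
COPY inference_code.py /opt/ml/code/
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code/
ENV SAGEMAKER_PROGRAM inference_code.py
# conda run launches the environment's python, where pandas from condaEnv.yml is importable
ENTRYPOINT ["conda", "run", "-n", "plato_vrar", "python", "/opt/ml/code/inference_code.py"]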

Related

Missing CV2 in Docker container

I've made the following Dockerfile to build a python application:
FROM python:3.7
WORKDIR /app
# Install python dependencies
ADD requirements.txt /app/
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
# Copy sources
ADD . /app
# Run detection
CMD ["detect.py" ]
ENTRYPOINT ["python3"]
The requirements.txt file contains only a few dependencies, including opencv:
opencv-python
opencv-python-headless
filterpy==1.1.0
lap==0.4.0
paho-mqtt==1.5.1
numpy
Pillow
Building the Docker image works perfectly.
When I try to run the image, I get the following error:
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "detect.py", line 6, in <module>
import cv2
File "/usr/local/lib/python3.7/site-packages/cv2/__init__.py", line 5, in <module>
from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Seems like the CV2 dependency is not satisfied.
Is there something I missed when building the Dockerfile?
I have tried replacing opencv-python-headless with python3-opencv, but there is no matching distribution found.
libGL.so.1 is provided by the libgl1 package, so you can add this to your Dockerfile:
RUN apt update; apt install -y libgl1
Typically, Docker images strip out many libraries to keep the size small; these dependencies are most likely already installed on your host system.
So I usually use dpkg -S on the host system to find which package I need, then install it in the container:
shubuntu1@shubuntu1:~$ dpkg -S libGL.so.1
libgl1:amd64: /usr/lib/x86_64-linux-gnu/libGL.so.1
libgl1:amd64: /usr/lib/x86_64-linux-gnu/libGL.so.1.0.0
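Put together, the Dockerfile from the question might look like this (a sketch; the --no-install-recommends flag and apt list cleanup are additions of mine, not part of the original answer):
FROM python:3.7
WORKDIR /app
# libGL.so.1 is required by opencv-python at import time
RUN apt-get update && \
    apt-get install -y --no-install-recommends libgl1 && \
    rm -rf /var/lib/apt/lists/*
# Install python dependencies
ADD requirements.txt /app/
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
# Copy sources
ADD . /app
# Run detection
CMD ["detect.py"]
ENTRYPOINT ["python3"]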

importing cuml in a docker image

I'm building a custom rapidsai Docker image based on its devel image. Here is the Dockerfile:
FROM rapidsai/rapidsai-dev:0.19-cuda11.0-devel-ubuntu18.04-py3.7
# Defining working directory and adding source code
WORKDIR /usr/src/app
RUN echo "Make sure cuml is installed:"
RUN python -c "import cuml"
But when I build it with this command,
nvidia-docker build . -t test
it returns an error saying:
Step 4/4 : RUN python -c "import cuml"
---> Running in 553d12bf7e68
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'cuml'
It seems that it can't find the cuml library, which is already part of the base image. Why can't it import it?
Just fixed it. This works without any issue:
FROM rapidsai/rapidsai-dev:0.19-cuda11.0-devel-ubuntu18.04-py3.7
# Defining working directory and adding source code
WORKDIR /usr/src/app
RUN echo "conda activate rapids" >> ~/.bashrc
SHELL ["/bin/bash", "--login", "-c"]
Then install whatever libraries you need, and continue:
# Use ENV rather than RUN export so the setting persists beyond its own layer;
# LD_LIBRARY_PATH entries are directories, not individual .so files
ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:/usr/local/cuda-11.0/compat${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
COPY test_cuml.py .
RUN echo "conda activate rapids" >> ~/.bashrc
SHELL ["/bin/bash", "--login", "-c"]
ENTRYPOINT ["/opt/conda/envs/rapids/bin/python", "test_cuml.py"]

Pandas not found in Docker Run Command [attaching volume]

When I build my docker image and run it using the following commands:
docker build -t iter1 .
docker run -it --rm --name iter1_run iter1
My application runs just fine. However, when I try to attach a volume and execute the following command:
docker run -it --rm --name iter_run -v /Users/xxxx/Desktop/Docker_Builds/SingleDocker/xxxxxx:/usr/src/oce -w /usr/src/oce python:3 python oce_test.py
The file oce_test.py can't find pandas.
Traceback (most recent call last):
File "oce_test.py", line 1, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
The content of my Dockerfile is as follows:
# Docker image
FROM python:3
# Copy requirements
COPY requirements.txt /
# Install Requirements
RUN pip install -r /requirements.txt
# Copy scripts needed for execution
COPY ./xxxx /usr/src/oce
# Establish a working directory
WORKDIR /usr/src/oce
# Execute required script
CMD ["python", "oce_test.py"]
The content of my requirements.txt is as follows:
numpy==1.18.1
pandas==1.0.1
matplotlib==3.1.3
scipy==1.4.1
Python-dateutil==2.8.1
David Maze answered this:
Your docker run command is running a plain python:3 image with no additional packages installed. If you want to use the image from your Dockerfile while overwriting the application code in the image with arbitrary content from your host, use your image name iter1 instead. (You don't need to repeat the image's WORKDIR or CMD as docker run options.)
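In other words, keep the volume mount but run your own image instead of the stock python:3 one, e.g.:
docker run -it --rm --name iter_run -v /Users/xxxx/Desktop/Docker_Builds/SingleDocker/xxxxxx:/usr/src/oce iter1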

Docker multi-stage with AWS CLI

I am setting up a multistage build in Docker where I need to pull some data from a remote image. In that remote image, I see they installed the AWS CLI using the following set of commands in order to get it into an Alpine-based image:
RUN apk --no-cache add python3 && \
pip3 install awscli && \
aws --version
The copy itself appears to work fine:
COPY --from=remote_setup /usr/bin/terraform /usr/bin/terraform
COPY --from=remote_setup /usr/bin/aws /usr/bin/aws
Terraform here runs peachy, but AWS does not. The output looks like this:
/ # terraform -v
Terraform v0.12.2
/ # ls -lh /usr/bin | grep aws
-rwxr-xr-x 1 root root 817 Jun 19 19:51 aws
/ # aws --version
/bin/sh: aws: not found
If I add python3, I then get this error:
/ # aws --version
Traceback (most recent call last):
File "/usr/bin/aws", line 19, in <module>
import awscli.clidriver
ModuleNotFoundError: No module named 'awscli'
Is there a trick to copying over all the data from a command in that particular layer to my new one or for simplicity's sake should I just install Python and the AWS CLI myself in my image?
Thanks!
pip is the standard Python package manager. In addition to installing a wrapper script in /usr/bin (or the current environment's bin directory) it also installs a fair bit of library code under a .../lib/pythonX.Y/site-packages/... tree. Also, packages are allowed to depend on other packages, so it's not going to just be a single directory in the site-packages directory.
In short: you will need the Python interpreter and everything the pip install installs, so you should run that command yourself in your derived image.
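Following that advice, a sketch of the derived image (the alpine:3.9 base tag is an assumption of mine; the install commands mirror the ones quoted from the remote image):
FROM alpine:3.9
# The terraform binary is self-contained, so the multi-stage copy still works
COPY --from=remote_setup /usr/bin/terraform /usr/bin/terraform
# Install the AWS CLI natively so its site-packages tree comes along
RUN apk --no-cache add python3 && \
    pip3 install awscli && \
    aws --version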

Docker build for numpy, pandas giving error

I have a Dockerfile in a directory called docker_test. The structure of docker_test is as follows:
M00618927A:docker_test i854319$ ls
Dockerfile hello_world.py
My Dockerfile looks like this:
### Dockerfile
# Created by Baktaawar
# Pulling from base Python image
FROM python:3.6.7-alpine3.6
# author of file
LABEL maintainer="Baktawar"
# Set the working directory of the docker image
WORKDIR /docker_test
COPY . /docker_test
# packages that we need
RUN pip --no-cache-dir install numpy pandas jupyter
EXPOSE 8888
ENTRYPOINT ["python"]
CMD ["hello_world.py"]
I run the command
docker build -t dockerfile .
It starts the build process but then fails with the following error, unable to install numpy and the other packages in the image:
Sending build context to Docker daemon 4.096kB
Step 1/8 : FROM python:3.6.7-alpine3.6
---> 8f30079779ef
Step 2/8 : LABEL maintainer="Baktawar"
---> Running in 7cf081021b1e
Removing intermediate container 7cf081021b1e
---> 581cf24fa4e6
Step 3/8 : WORKDIR /docker_test
---> Running in 7c58855c4332
Removing intermediate container 7c58855c4332
---> dae70a34626b
Step 4/8 : COPY . /docker_test
---> 432b174b4869
Step 5/8 : RUN pip --no-cache-dir install numpy pandas jupyter
---> Running in 972efa9336ed
Collecting numpy
Downloading https://files.pythonhosted.org/packages/cf/8d/6345b4f32b37945fedc1e027e83970005fc9c699068d2f566b82826515f2/numpy-1.16.2.zip (5.1MB)
Collecting pandas
Downloading https://files.pythonhosted.org/packages/81/fd/b1f17f7dc914047cd1df9d6813b944ee446973baafe8106e4458bfb68884/pandas-0.24.1.tar.gz (11.8MB)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 357, in get_provider
module = sys.modules[moduleOrReq]
KeyError: 'numpy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-8c3o0ycd/pandas/setup.py", line 732, in <module>
ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
File "/tmp/pip-install-8c3o0ycd/pandas/setup.py", line 475, in maybe_cythonize
numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1142, in resource_filename
return get_provider(package_or_requirement).get_resource_filename(
File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 359, in get_provider
__import__(moduleOrReq)
ModuleNotFoundError: No module named 'numpy'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-8c3o0ycd/pandas/
You are using pip version 18.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
The command '/bin/sh -c pip --no-cache-dir install numpy pandas jupyter' returned a non-zero code: 1
How can I get this basic setup done?
You basically need to install the following on Alpine in order to be able to install numpy:
apk --no-cache add musl-dev linux-headers g++
Try the following Dockerfile:
### Dockerfile
# Created by Baktawar
# Pulling from base Python image
FROM python:3.6.7-alpine3.6
# author of file
LABEL maintainer="Baktawar"
# Set the working directory of the docker image
WORKDIR /app
COPY . /app
# Install native libraries, required for numpy
RUN apk --no-cache add musl-dev linux-headers g++
# Upgrade pip
RUN pip install --upgrade pip
# packages that we need
RUN pip install numpy && \
pip install pandas && \
pip install jupyter
EXPOSE 8888
ENTRYPOINT ["python"]
CMD ["hello_world.py"]
You may find this gist interesting:
https://gist.github.com/orenitamar/f29fb15db3b0d13178c1c4dd611adce2
And this Alpine package is also of interest, I think:
https://pkgs.alpinelinux.org/package/edge/community/x86/py-numpy
Update
In order to tag the image properly, use the syntax:
docker build -f <dockerfile> -t <tagname:tagversion> <buildcontext>
For you, this would be:
docker build -t mypythonimage:0.1 .
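You can then run the freshly tagged image; since the ENTRYPOINT is python and the CMD is hello_world.py, no extra arguments are needed:
docker run --rm mypythonimage:0.1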
