Google Cloud Platform - AI Notebook is deleted after instance is stopped - docker

I've set up an AI Platform Notebook instance using a custom container built from the Dockerfile below. I can access the notebook via the JupyterLab interface, but when I save everything, stop the notebook, and then start it back up, I lose all of my files.
I cannot figure out where to configure this in the GCP Console or in my Dockerfile.
Any advice would be greatly appreciated.
Dockerfile:
FROM osgeo/gdal:ubuntu-small-3.0.4
ARG NB_USER="root"
ARG NB_UID="1000"
ARG NB_GID="100"
USER root
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    ca-certificates \
    python3-pip \
    unzip \
    wget \
    python3-rtree \
    python-numpy \
    git \
    gdal-bin \
    libgtk2.0-dev \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /root
RUN mkdir /root/work
# update pip
RUN python3 -m pip install --upgrade pip \
    && python3 -m pip install wheel \
    && python3 -m pip install pip setuptools \
    && python3 -m pip install notebook==6.0.0 \
    && python3 -m pip install jupyterhub==1.0.0 \
    && python3 -m pip install jupyterlab==1.1.3
RUN python3 -m pip install git+https://github.com/toblerity/shapely.git#master#egg=shapely-1.7.1dev \
    && python3 -m pip install rasterio \
    && python3 -m pip install geopandas \
    && python3 -m pip install descartes \
    && python3 -m pip install solaris \
    && python3 -m pip install rio-tiler
COPY start.sh /usr/local/bin/
COPY start-notebook.sh /usr/local/bin/
COPY start-singleuser.sh /usr/local/bin/
COPY jupyter_notebook_config.py /etc/jupyter/
EXPOSE 8080
CMD ["jupyter", "lab", "--ip", "0.0.0.0", "--allow-root"]
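
Note: on AI Platform Notebooks, only the attached data disk survives a stop/start cycle; the container's own filesystem is reset when the instance comes back up, so anything saved under /root/work is lost. The standard images mount that persistent disk at /home/jupyter inside the container. A minimal sketch of the idea, assuming your instance does the same (worth verifying in the instance's disk settings), is to keep all work under that mount:
# Assumption: the instance's persistent data disk is mounted at /home/jupyter,
# as on the standard AI Platform Notebook images; files there survive a stop/start.
WORKDIR /home/jupyter
CMD ["jupyter", "lab", "--ip", "0.0.0.0", "--allow-root", "--notebook-dir=/home/jupyter"]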

Related

How to solve python error in docker - Failed building wheel for pyarrow?

I am trying to build in Bamboo and got this error:
Failed to build pyarrow
21-Sep-2022 06:24:14 ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects
21-Sep-2022 06:24:15 The command '/bin/sh -c pip install --upgrade pip && pip install pyarrow' returned a non-zero code: 1
21-Sep-2022 06:24:15 An error occurred when executing task 'DockerBuild'.
This error occurs only when I add pyarrow or fastparquet in requirements.txt.
This is my requirements.txt file:
requests
urllib3
fastapi
uvicorn[standard]
gunicorn
pytest-cov
prometheus-fastapi-instrumentator
prometheus_client
fastapi-health
python-decouple
ecs-logging
fastapi_health
psycopg2
arrow
anyio
asgiref
certifi
charset-normalizer
click
colorama
h11
idna
python-dotenv
pydantic
sniffio
starlette
typing_extensions
datetime
fastapi_resource_server
sendgrid
PyJWT==2.4.0
bcrypt==3.2.
cryptography==37.0.2
passlib
jose
jira
adal==1.2.7
aiohttp==3.8.1
aiosignal==1.2.0
async-timeout==4.0.2
azure-core==1.25.0
azure-identity==1.10.0
azure-storage-blob==12.13.1
pandas==1.4.4
multidict==6.0.2
numpy==1.23.2
ordered-set==4.1.0
oauthlib==3.2.0
packaging==21.3
python-dateutil==2.8.2
pytz==2022.2.1
requests-oauthlib==1.3.1
six==1.16.0
yarl==1.8.1
Below is my Dockerfile:
FROM python:3.10.4-alpine3.15
RUN adduser -D pythonwebapi
WORKDIR /home/pythonwebapi
COPY requirements.txt requirements.txt
COPY logger_config.py logger_config.py
RUN echo 'http://dl-3.alpinelinux.org/alpine/v3.12/main' >> /etc/apk/repositories
RUN apk upgrade && apk add make gcc g++
RUN apk update
RUN apk add libffi-dev
RUN apk add postgresql-dev gcc python3-dev musl-dev
RUN apk add --no-cache musl-dev linux-headers g++
RUN pip install --upgrade pip && pip install arrow && pip install pyarrow
RUN pip install -r requirements.txt && pip install gunicorn
RUN apk del gcc g++ make
COPY app app
COPY init_app.py ./
ENV FLASK_APP init_app.py
RUN chown -R pythonwebapi:pythonwebapi ./
RUN chmod -R 777 ./
USER pythonwebapi
EXPOSE 8000 7000
ENTRYPOINT ["gunicorn","--timeout", "1000","init_app:app","-k","uvicorn.workers.UvicornWorker","-b","0.0.0.0"]
Is this error because of the Python image?
I am still learning Docker, so I am not sure what went wrong here. Can anyone please help me understand it?
I have since changed the Dockerfile to build pyarrow from source, because I learned that pyarrow wheels are not provided for Alpine.
FROM python:3.9-alpine
RUN adduser -D pythonwebapi
WORKDIR /home/pythonwebapi
COPY requirements.txt requirements.txt
COPY logger_config.py logger_config.py
RUN echo 'http://dl-3.alpinelinux.org/alpine/v3.9/main' >> /etc/apk/repositories
RUN apk update \
    && apk upgrade \
    && apk add --no-cache build-base \
    autoconf \
    bash \
    bison \
    boost-dev \
    cmake \
    flex \
    libressl-dev \
    zlib-dev
RUN apk add make gcc g++
RUN apk add libffi-dev
RUN apk add postgresql-dev gcc python3-dev musl-dev
RUN pip install --upgrade pip && pip install -r requirements.txt && pip install gunicorn
RUN apk del gcc g++ make
RUN pip install --no-cache-dir six pytest numpy cython
RUN pip install --no-cache-dir pandas
ARG ARROW_VERSION=3.0.0
ARG ARROW_SHA1=c1fed962cddfab1966a0e03461376ebb28cf17d3
ARG ARROW_BUILD_TYPE=release
ENV ARROW_HOME=/usr/local \
    PARQUET_HOME=/usr/local
# Download and build apache-arrow
RUN mkdir -p /arrow \
    && wget -q https://github.com/apache/arrow/archive/apache-arrow-${ARROW_VERSION}.tar.gz -O /tmp/apache-arrow.tar.gz \
    && echo "${ARROW_SHA1} */tmp/apache-arrow.tar.gz" | sha1sum -c - \
    && tar -xvf /tmp/apache-arrow.tar.gz -C /arrow --strip-components 1 \
    && mkdir -p /arrow/cpp/build \
    && cd /arrow/cpp/build \
    && cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
        -DOPENSSL_ROOT_DIR=/usr/local/ssl \
        -DCMAKE_INSTALL_LIBDIR=lib \
        -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
        -DARROW_WITH_BZ2=ON \
        -DARROW_WITH_ZLIB=ON \
        -DARROW_WITH_ZSTD=ON \
        -DARROW_WITH_LZ4=ON \
        -DARROW_WITH_SNAPPY=ON \
        -DARROW_PARQUET=ON \
        -DARROW_PYTHON=ON \
        -DARROW_PLASMA=ON \
        -DARROW_BUILD_TESTS=OFF \
        .. \
    && make -j$(nproc) \
    && make install \
    && cd /arrow/python \
    && python setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet \
    && python setup.py install \
    && rm -rf /arrow /tmp/apache-arrow.tar.gz
COPY app app
COPY init_app.py ./
ENV FLASK_APP init_app.py
RUN chown -R pythonwebapi:pythonwebapi ./
RUN chmod -R 777 ./
USER pythonwebapi
EXPOSE 8000 7000
ENTRYPOINT ["gunicorn", "--timeout", "5000", "init_app:app", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0", "-m 3000m"]

Entrypoint not found when deployed to Fargate. Locally works

I have the following Dockerfile, currently working locally in my device:
FROM python:3.7-slim-buster
WORKDIR /app
COPY . /app
VOLUME /app
RUN chmod +x /app/cat/sitemap_download.py
COPY entrypoint.sh /app/entrypoint.sh
RUN chmod +x /app/entrypoint.sh
ARG VERSION=3.7.4
RUN apt update && \
    apt install -y bash wget && \
    wget -O /tmp/nordrepo.deb https://repo.nordvpn.com/deb/nordvpn/debian/pool/main/nordvpn-release_1.0.0_all.deb && \
    apt install -y /tmp/nordrepo.deb && \
    apt update && \
    apt install -y nordvpn=$VERSION && \
    apt remove -y wget nordvpn-release
RUN apt-get clean \
    && apt-get -y update
RUN apt-get -y install python3-dev \
    python3-psycopg2 \
    && apt-get -y install build-essential
RUN pip install --upgrade pip
RUN pip install -r cat/requirements.txt
RUN pip install awscli
ENTRYPOINT ["sh", "-c", "./entrypoint.sh"]
But when I deploy it to Fargate, the container stops before reaching the steady state with:
sh: 1: ./entrypoint.sh: not found
Edit: Adding entrypoint.sh file for clarification:
#!/bin/env sh
# start process, but it should exit once the file is in S3
/app/cat/sitemap_download.py
# Once the process is done, we are good to scale down the service
aws ecs update-service --cluster cluster_name --region eu-west-1 --service service-name --desired-count 0
I have tried modifying ENTRYPOINT to use the exec form, or with the full path, but I always get the same issue. Any ideas on what I am doing wrong?
I've managed to fix it now.
Changing the Dockerfile to look as follows solves the issue:
FROM python:3.7-slim-buster
WORKDIR /app
COPY . /app
VOLUME /app
RUN chmod +x /app/cat/sitemap_download.py
COPY entrypoint.sh /app/entrypoint.sh
RUN chmod +x /app/entrypoint.sh
ARG VERSION=3.7.4
RUN apt update && \
    apt install -y bash wget && \
    wget -O /tmp/nordrepo.deb https://repo.nordvpn.com/deb/nordvpn/debian/pool/main/nordvpn-release_1.0.0_all.deb && \
    apt install -y /tmp/nordrepo.deb && \
    apt update && \
    apt install -y nordvpn=$VERSION && \
    apt remove -y wget nordvpn-release
RUN apt-get clean \
    && apt-get -y update
RUN apt-get -y install python3-dev \
    python3-psycopg2 \
    && apt-get -y install build-essential
RUN pip install --upgrade pip
RUN pip install -r cat/requirements.txt
RUN pip install awscli
ENTRYPOINT ["/bin/bash"]
CMD ["./entrypoint.sh"]
I tried this after reading: What is the difference between CMD and ENTRYPOINT in a Dockerfile?
I believe this syntax fixes it because the ENTRYPOINT tells Docker to run bash at start, and the CMD passes the script to it as a parameter.
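
Worth adding: "not found" for a script that plainly exists usually means the kernel could not find the script's interpreter, not the script itself. The shebang here is #!/bin/env sh, and on slim-buster env lives at /usr/bin/env, so the exec fails and sh misleadingly reports ./entrypoint.sh: not found. A minimal sketch of the alternative fix, keeping the exec form:
#!/usr/bin/env sh
# entrypoint.sh: portable shebang; /bin/env does not exist on Debian-based images
and in the Dockerfile:
# exec form, no wrapping shell needed once the shebang resolves
ENTRYPOINT ["/app/entrypoint.sh"]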

Python write permission denied in docker container

I'm trying to use pandas to write a CSV file from my Flask app (wrapped with uWSGI), but I keep getting a permission-denied error despite adding the uwsgi user to my user group. Have I missed something?
Dockerfile:
FROM kennethreitz/pipenv as build
ADD . /app
WORKDIR /app
RUN pipenv install --deploy --system \
    && pipenv lock -r > requirements.txt \
    && pipenv run python setup.py bdist_wheel
FROM ubuntu:bionic
COPY --from=build /app/dist/*.whl .
ARG DEBIAN_FRONTEND=noninteractive
RUN set -xe \
    && apt-get update -q \
    && apt-get install -y -q \
        python3-wheel \
        python3-pip \
        uwsgi-plugin-python3 \
    && python3 -m pip install *.whl \
    && apt-get remove -y python3-pip python3-wheel \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -f *.whl \
    && rm -rf /var/lib/apt/lists/* \
    && mkdir -p /app \
    && useradd _uwsgi --no-create-home --user-group
USER _uwsgi
ENTRYPOINT ["/usr/bin/uwsgi", \
    "--master", \
    "--enable-threads", \
    "--die-on-term", \
    "--plugin", "python3"]
CMD ["--http-socket", "0.0.0.0:8000", \
    "--processes", "4", \
    "--chdir", "/app", \
    "--check-static", "static", \
    "--module", "server:app"]

The command '/bin/sh -c conda update conda' returned a non-zero code: 127

I'm on a Mac and trying to build a new container (I'm new to Docker). I can get Anaconda installed fine, and updating Anaconda from within the container works. However, when I run conda update conda from the Dockerfile, I get the error below. What am I doing wrong?
Thanks!
The command '/bin/sh -c conda update conda' returned a non-zero code: 127
FROM ubuntu:18.04
RUN apt-get update && \
    apt-get -y install curl && \
    apt-get -y install python3 && \
    apt-get -y install python3-pip && \
    python3 -m pip install --upgrade pip && \
    apt-get -y install wget && \
    apt-get -y install vim && \
    pip3 install tensorflow && \
    pip3 install keras
RUN wget --quiet https://repo.anaconda.com/archive/Anaconda3-5.3.0-Linux-x86_64.sh -O ~/anaconda.sh && \
    /bin/bash ~/anaconda.sh -b -p /opt/conda && \
    rm ~/anaconda.sh && \
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
    echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc
RUN conda update conda
RUN conda install opencv
RUN conda install matplotlib
RUN conda install pandas
RUN conda install seaborn
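
Exit code 127 means "command not found": each RUN line executes under a non-interactive /bin/sh, which never reads ~/.bashrc, so the activation lines added above have no effect and /opt/conda/bin is not on the PATH. A minimal sketch of the usual fix, assuming the /opt/conda install prefix used above:
# put conda on the PATH for all subsequent RUN instructions
ENV PATH=/opt/conda/bin:$PATH
RUN conda update -y conda && \
    conda install -y opencv matplotlib pandas seaborn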

ImportError: No module named package using pip in Docker

I'm using Docker to build my application. I'm using pip to install the packages from requirements.txt, but the package is not included in the build.
FROM python:3.4
WORKDIR /app
ADD . /app
RUN apt-get update && apt-get install -y \
    python3-pip python-pip \
    cron \
    unixodbc \
    unixodbc-dev \
    python3-dev \
    python3-setuptools \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip
RUN pip install sendgrid
RUN pip3 install -r requirements.txt
ENV CONFIG_ENV .env
ADD validator-cron /etc/cron.d/validator-cron-job
RUN chmod 0644 /etc/cron.d/validator-cron-job
RUN touch /var/log/cron.log
CMD cron && tail -f /var/log/cron.log
I'm installing sendgrid using pip, but I'm getting an "ImportError: No module named" error.
I have resolved the issue.
It was caused by the Python path.
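
For anyone hitting the same thing: the python:3.4 image ships its own interpreter and pip, but the apt packages python-pip and python3-pip pull in Debian's separate Python installations, so pip, pip3, and whatever python the cron job invokes can each point at a different interpreter with its own site-packages. A minimal sketch of the usual guard, assuming the cron job runs the image's python:
# tie every install to the interpreter that will actually run the code
RUN python -m pip install --upgrade pip \
    && python -m pip install sendgrid \
    && python -m pip install -r requirements.txt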
