I'm running the dbt-snowflake Docker image and need to pass parameters when running the container, so I tried passing --vars from the command prompt, but I get the error below.
12:54:44 Running with dbt=1.3.1
12:54:45 Encountered an error:
'dbt_snowflake://macros/apply_grants.sql'
12:54:45 Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 135, in main
results, succeeded = handle_and_check(args)
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 198, in handle_and_check
task, res = run_from_args(parsed)
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 245, in run_from_args
results = task.run()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 453, in run
self._runtime_initialize()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 161, in _runtime_initialize
super()._runtime_initialize()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 94, in _runtime_initialize
self.load_manifest()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 81, in load_manifest
self.manifest = ManifestLoader.get_full_manifest(self.config)
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 221, in get_full_manifest
manifest = loader.load()
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 320, in load
self.load_and_parse_macros(project_parser_files)
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 422, in load_and_parse_macros
block = FileBlock(self.manifest.files[file_id])
KeyError: 'dbt_snowflake://macros/apply_grants.sql'
Below is my Dockerfile:
# Top level build args
ARG build_for=linux/amd64
##
# base image (abstract)
##
FROM --platform=$build_for python:3.10.7-slim-bullseye as base
ARG dbt_core_ref=dbt-core#v1.4.0a1
ARG dbt_postgres_ref=dbt-core#v1.4.0a1
ARG dbt_redshift_ref=dbt-redshift#v1.4.0a1
ARG dbt_bigquery_ref=dbt-bigquery#v1.4.0a1
ARG dbt_snowflake_ref=dbt-snowflake#v1.3.0
ARG dbt_spark_ref=dbt-spark#v1.4.0a1
# special case args
ARG dbt_spark_version=all
ARG dbt_third_party
# System setup
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
git \
ssh-client \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/*
# Env vars
ENV PYTHONIOENCODING=utf-8
ENV LANG=C.UTF-8
# Update python
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir
RUN pip install -q --no-cache-dir dbt-core
RUN pip install -q --no-cache-dir dbt-snowflake
# RUN mkdir /root/.dbt
# ADD profiles.yml /root/.dbt
# Set docker basics
WORKDIR /usr/app/dbt/
VOLUME /usr/app
COPY **/profiles.yml /root/.dbt/profiles.yml
COPY . /usr/app/dbt/
ENTRYPOINT ["dbt"]
Here is my Docker image: docker pull madhuraju/gu-snowflake
Below is the command:
docker run -it gu-snowflake:test run --vars '{"testKey": "testValue"}'
Please let me know how I can fix this issue, and also how I can pass values at runtime so that dbt executes only specific models based on the values being passed.
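For reference, since the image's ENTRYPOINT is ["dbt"], everything after the image name is forwarded to dbt as arguments, so model selection and variables can be combined on one command line. A minimal sketch (my_model and run_date are hypothetical names):
docker run -it gu-snowflake:test run --select my_model --vars '{"run_date": "2023-01-01"}'
Inside a model, such a value is then read with {{ var("run_date") }}.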
Related
I am using Streamlit and LightGBM with Docker and I am currently getting this error:
OSError: libgomp.so.1: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/src/stream.py", line 6, in <module>
model = pickle.load(open('model.pkl', 'rb'))
File "/usr/local/lib/python3.10/site-packages/lightgbm/__init__.py", line 8, in <module>
from .basic import Booster, Dataset, Sequence, register_logger
File "/usr/local/lib/python3.10/site-packages/lightgbm/basic.py", line 110, in <module>
_LIB = _load_lib()
File "/usr/local/lib/python3.10/site-packages/lightgbm/basic.py", line 101, in _load_lib
lib = ctypes.cdll.LoadLibrary(lib_path[0])
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
I tried updating my Dockerfile to include libgomp1 alongside the other dependencies listed in this solution: https://stackoverflow.com/a/73140670/14651739
My Dockerfile:
FROM ubuntu:latest
WORKDIR /src/
RUN apt-get update && apt-get install -y --no-install-recommends apt-utils
RUN apt-get update && \
apt-get install -y --no-install-recommends \
ca-certificates \
cmake \
build-essential \
gcc \
g++ \
git && \
rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get -y install curl
RUN apt-get install libgomp1 -y
# apt-get install curl
FROM python:3.10.0-slim-buster
WORKDIR /src
COPY requirements.txt /src/
COPY --from=0 /src ./
RUN pip3 install -r requirements.txt
EXPOSE 8501
COPY . /src/
ENTRYPOINT ["streamlit", "run"]
CMD [ "stream.py" ]
My requirements.txt file:
pandas
sklearn
streamlit
lightgbm
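For context, apt packages installed in the first build stage are discarded when the final stage starts from python:3.10.0-slim-buster, so libgomp1 has to be installed in the stage the app actually runs in. A minimal sketch of such a final stage, assuming everything else stays the same:
FROM python:3.10.0-slim-buster
WORKDIR /src
# LightGBM's compiled library links against the OpenMP runtime (libgomp.so.1)
RUN apt-get update && \
    apt-get install -y --no-install-recommends libgomp1 && \
    rm -rf /var/lib/apt/lists/*
COPY requirements.txt /src/
RUN pip3 install -r requirements.txt
EXPOSE 8501
COPY . /src/
ENTRYPOINT ["streamlit", "run"]
CMD [ "stream.py" ]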
I am using Poetry to install a Python project in a Docker container. Below you can find my Dockerfile, which used to work fine until recently, when I switched to a new version of Poetry (1.2.1) and the new recommended Poetry installer:
# pull official base image
FROM ubuntu:20.04
ENV PATH = "${PATH}:/home/poetry/bin"
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
# copy project
COPY . $APP_HOME
WORKDIR $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false
RUN poetry install --only main
# copy entrypoint-prod.sh
COPY ./entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh
RUN chmod a+x $APP_HOME/entrypoint.sh
# chown all the files to the app user
RUN chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
The poetry install works fine; I attached to a running container, ran it myself, and found that it works without problems. However, when I open a Python console and try to import a module (django) that is installed by the Poetry project, the module is not found. Please note that I am installing my project into the system environment (poetry config virtualenvs.create false). I verified that there is only one version of Python installed in the Docker container. The specific error I get when trying to import a Python module installed by Poetry is: ModuleNotFoundError: No module named xxxx
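For what it's worth, a quick sanity check that the console's interpreter and the one Poetry installed into agree (django being the module from the question) could look like:
python3 -c "import sys; print(sys.executable); print(sys.path)"
poetry show django    # does Poetry consider the package installed?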
Although this is not an answer, it is too long to fit within the comment section. It is rather a piece of advice:
declare your ENV at the top of the Dockerfile to make it easier to read.
merge the multiple RUN commands together to avoid creating useless intermediate layers. In the particular case of apt-get install, this also prevents you from installing packages from a list that dates back to the first "apt-get update": since the command line has not changed, Docker will not re-execute the command and thus will not refresh the package list.
avoid copying all the files in "." when you have previously copied some specific files to specific places.
Here, your Dockerfile could rather look like this:
# pull official base image
FROM ubuntu:20.04
ENV PATH="${PATH}:/home/poetry/bin"
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y \
curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN mkdir -p /home/app && \
adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
WORKDIR $APP_HOME
# copy project
COPY . $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false && \
poetry install --only main
# copy entrypoint-prod.sh
RUN cp $APP_HOME/entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh && \
chmod a+x $APP_HOME/entrypoint.sh && \
chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
UPDATE:
Let's get back to your question. Having your program run okay when you "run it yourself" does not mean all the dependencies are met. Indeed, it can simply mean that your module has not been imported yet (and thus has not yet triggered the ModuleNotFoundError exception).
In order to validate this theory, you can either:
create a simple application which imports the failing module and then quits (see the one-liner after this list). If the import succeeds, then there is something weird indeed.
list the installed modules with poetry show --latest. If the package is listed, then there is something weird indeed.
If none of the above indicates the module is installed, that just means the module is not installed and you should update your Dockerfile to install it.
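For example, such an import check can be a single command (django being the failing module from the question):
python -c "import django; print(django.get_version())"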
NOTE: I do not know much about Poetry, but you may want to have a list of external dependencies to be met for your application. In the case of pip3, that list is expressed as a file named requirements.txt and can be installed with pip3 install -r requirements.txt.
It turns out this is a known bug in Poetry: https://github.com/python-poetry/poetry/issues/6459
What I have:
I have set up an Ubuntu VM using Vagrant. Inside this VM, I want to build a Docker image which should run some services that will be connected to some clients outside the VM. This structure is fixed and cannot be changed. One of the Docker images uses ML frameworks, namely TensorFlow and PyTorch. The source code to be executed inside the Docker image is bundled using PyInstaller. The building and bundling work perfectly.
But, if I try to run the built Docker image, I get the following error message:
[1] WARNING: file already exists but should not: /tmp/_MEIl2gg3t/torch/_C.cpython-37m-x86_64-linux-gnu.so
[1] WARNING: file already exists but should not: /tmp/_MEIl2gg3t/torch/_dl.cpython-37m-x86_64-linux-gnu.so
['/tmp/_MEIl2gg3t/base_library.zip', '/tmp/_MEIl2gg3t/lib-dynload', '/tmp/_MEIl2gg3t']
[8] Failed to execute script '__main__' due to unhandled exception!
Traceback (most recent call last):
File "__main__.py", line 4, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "app.py", line 6, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "controller.py", line 3, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "torch/__init__.py", line 199, in <module>
ImportError: librt.so.1: cannot open shared object file: No such file or directory
Dockerfile
ARG PRJ=unspecified
ARG PYINSTALLER_ARGS=
ARG LD_LIBRARY_PATH_EXTENSION=
ARG PYTHON_VERSION=3.7
###############################################################################
# Stage 1: BUILD PyInstaller
###############################################################################
# Alpine:
#FROM ... as build-pyinstaller
# Ubuntu:
FROM ubuntu:18.04 as build-pyinstaller
ARG PYTHON_VERSION
# Ubuntu:
RUN apt-get update && apt-get install -y \
python$PYTHON_VERSION \
python$PYTHON_VERSION-dev \
python3-pip \
unzip \
# Ubuntu+Alpine:
libc-dev \
g++ \
git
# Make our Python version the default
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python$PYTHON_VERSION 1 && python3 --version
# Alpine:
#
# # Install pycrypto so --key can be used with PyInstaller
# RUN pip install \
# pycrypto
# Install PyInstaller
RUN python3 -m pip install --proxy=${https_proxy} --no-cache-dir \
pyinstaller
###############################################################################
# Stage 2: BUILD our service with Python and pyinstaller
###############################################################################
FROM build-pyinstaller
# Upgrade pip and setuptools
RUN python3 -m pip install --no-cache-dir --upgrade \
pip \
setuptools
# Install pika and protobuf here as they will be required by all our services,
# and installing in every image would take more time.
# If they should no longer be required everywhere, we could instead create
# with-pika and with-protobuf images and copy the required, installed libraries
# to the final build image (similar to how it is done in cpp).
RUN python3 -m pip install --no-cache-dir \
pika \
protobuf
# Add "worker" user to avoid running as root (used in the "run" image below)
# Alpine:
#RUN adduser -D -g "" worker
# Ubuntu:
RUN adduser --disabled-login --gecos "" worker
RUN mkdir -p /opt/export/home/worker && chown -R worker /opt/export/home/worker
ENV HOME /home/worker
# Copy /etc/passwd and /etc/group to the export directory so that they will be installed in the final run image
# (this makes the "worker" user available there; adduser is not available in "FROM scratch").
RUN export-install \
/etc/passwd \
/etc/group
# Create tmp directory that may be required in the runner image
RUN mkdir /opt/export/install/tmp && chmod ogu+rw /opt/export/install/tmp
# When using this build-parent ("FROM ..."), the following ONBUILD commands are executed.
# Files from pre-defined places in the local project directory are copied to the image (see below for details).
# Use the PRJ and MAIN_MODULE arguments that have to be set in the individual builder image that uses this image in FROM ...
ONBUILD ARG PRJ
ONBUILD ENV PRJ=embedded.adas.emergencybreaking
ONBUILD WORKDIR /opt/prj/embedded.adas.emergencybreaking/
# "prj" must contain all files that are required for building the Python app.
# This typically contains a requirements.txt - in this step we only copy requirements.txt
# so that "pip install" is not run after every source file change.
ONBUILD COPY pr[j]/requirements.tx[t] /opt/prj/embedded.adas.emergencybreaking/
# Install required python dependencies for our service - the result stored in a separate image layer
# which is used as cache in the next build even if the source files were changed (those are copied in one of the next steps).
ONBUILD RUN python3 -m pip install --no-cache-dir -r /opt/prj/embedded.adas.emergencybreaking/requirements.txt
# Install all linux packages that are listed in /opt/export/build/opt/prj/*/install-packages.txt
# and /opt/prj/*/install-packages.txt
ONBUILD COPY .placeholder pr[j]/install-packages.tx[t] /opt/prj/embedded.adas.emergencybreaking/
ONBUILD RUN install-build-packages
# "prj" must contain all files that are required for building the Python app.
# This typically contains a dependencies/lib directory - in this step we only copy that directory
# so that "pip install" is not run after every source file change.
ONBUILD COPY pr[j]/dependencie[s]/li[b] /opt/prj/embedded.adas.emergencybreaking/dependencies/lib
# .egg/.whl archives can contain binary .so files which can be linked to system libraries.
# We need to copy the system libraries that are linked from .so files in .whl/.egg packages.
# (Maybe Py)
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.whl /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.egg; do \
if [ -e "$lib_file" ]; then \
mkdir -p /tmp/lib; \
cd /tmp/lib; \
unzip $lib_file "*.so"; \
find /tmp/lib -iname "*.so" -exec ldd {} \; ; \
linked_libs=$( ( find /tmp/lib -iname "*.so" -exec get-linked-libs {} \; ) | egrep -v "^/tmp/lib/" ); \
export-install $linked_libs; \
cd -; \
rm -rf /tmp/lib; \
fi \
done
# Install required python dependencies for our service - the result is stored in a separate image layer
# which can be used as cache in the next build even if the source files are changed (those are copied in one of the next steps).
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.whl; do \
[ -e "$lib_file" ] || continue; \
\
echo "python3 -m pip install --no-cache-dir $lib_file" && \
python3 -m pip install --no-cache-dir $lib_file; \
done
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.egg; do \
[ -e "$lib_file" ] || continue; \
\
# Note: This will probably not work any more as easy_install is no longer contained in setuptools!
echo "python3 -m easy_install $lib_file" && \
python3 -m easy_install $lib_file; \
done
# Copy the rest of the prj directory.
ONBUILD COPY pr[j] /opt/prj/embedded.adas.emergencybreaking/
# Show what files we are working on
ONBUILD RUN find /opt/prj/embedded.adas.emergencybreaking/ -type f
# Create an executable with PyInstaller so that python does not need to be installed in the "run" image.
# This produces a lot of error messages like this:
# Error relocating /usr/local/lib/python3.8/lib-dynload/_uuid.cpython-38-x86_64-linux-gnu.so: PyModule_Create2: symbol not found
# If the reported functions/symbols are called from our python service, the missing dependencies probably have to be installed.
ONBUILD ARG PYINSTALLER_ARGS
ONBUILD ENV PYINSTALLER_ARGS=${PYINSTALLER_ARGS}
ONBUILD ARG LD_LIBRARY_PATH_EXTENSION
ONBUILD ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${LD_LIBRARY_PATH_EXTENSION}
ONBUILD RUN mkdir -p /usr/lib64 # Workaround for FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib64' from pyinstaller
ONBUILD RUN \
apt-get update && \
apt-get install -y \
libgl1-mesa-glx \
libx11-xcb1 && \
apt-get clean all && \
rm -r /var/lib/apt/lists/* && \
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" && \
echo "pyinstaller -p /opt/prj/embedded.adas.emergencybreaking/src -p /opt/prj/embedded.adas.emergencybreaking/dependencies/src -p /usr/local/lib/python3.7/dist-packages --hidden-import=torch --hidden-import=torchvision --onefile ${PYINSTALLER_ARGS} /opt/prj/embedded.adas.emergencybreaking/src/adas_emergencybreaking/__main__.py" && \
pyinstaller -p /opt/prj/embedded.adas.emergencybreaking/src -p /opt/prj/embedded.adas.emergencybreaking/dependencies/src -p /usr/local/lib/python3.7/dist-packages --hidden-import=torch --hidden-import=torchvision --onefile ${PYINSTALLER_ARGS} /opt/prj/embedded.adas.emergencybreaking/src/adas_emergencybreaking/__main__.py ; \
# Maybe we will need to add additional paths with -p ...
# Copy the runnable to our default location /opt/run/app
ONBUILD RUN mkdir -p /opt/run && \
cp -p -v /opt/prj/embedded.adas.emergencybreaking/dist/__main__ /opt/run/app
# Show linked libraries (as static linking does not work yet these have to be copied to the "run" image below)
#ONBUILD RUN get-linked-libs /usr/local/lib/libpython*.so.*
#ONBUILD RUN get-linked-libs /opt/run/app
# Add the executable and all linked libraries to the export/install directory
# so that they will be copied to the final "run" image
ONBUILD RUN export-install $( get-linked-libs /opt/run/app )
# Show what we have produced
ONBUILD RUN find /opt/export -type f
The requirements.txt, which is used to install my dependencies, looks like this:
numpy
tensorflow-cpu
matplotlib
--find-links https://download.pytorch.org/whl/torch_stable.html
torch==1.11.0+cpu
--find-links https://download.pytorch.org/whl/torch_stable.html
torchvision==0.12.0+cpu
Is there anything obviously wrong here?
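As a diagnostic sketch: since the run image only receives the libraries passed to export-install, one way to enumerate, inside the build stage, the system libraries that torch's compiled extensions link against (librt.so.1 among them, per the traceback) is ldd over the installed package (the dist-packages path is the one referenced in the pyinstaller call above):
find /usr/local/lib/python3.7/dist-packages/torch -name "*.so" -exec ldd {} \; | awk '/=>/{print $3}' | sort -u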
I built an Ubuntu image using the following Dockerfile:
FROM ubuntu:20.04
# Disable Prompt During Packages Installation
ARG DEBIAN_FRONTEND=noninteractive
# Add 32bit architecture
RUN dpkg --add-architecture i386 \
&& apt-get update \
&& apt-get install -y libc6:i386 libncurses5:i386 libstdc++6:i386 zlib1g:i386
RUN apt-get update && apt-get install -y locales && rm -rf /var/lib/apt/lists/* \
&& localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.utf8
RUN apt-get update && apt-get install -y \
iputils-ping \
python3 python3-pip
# Copy app to container
COPY . /app
WORKDIR /app
# Install pip requirements
COPY requirements.txt /app
RUN python3 -m pip install -r requirements.txt
# During debugging, this entry point will be overridden. For more information, please refer to https://aka.ms/vscode-docker-python-debug
CMD ["bash"]
I've been trying to run a 32-bit app (hence the first RUN command in the Dockerfile) that I have inside the my_app directory, using:
./app
but I keep getting
bash: ./app: No such file or directory
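For diagnosis: when bash reports "No such file or directory" for a file that clearly exists, it is usually the binary's ELF interpreter (for 32-bit programs, /lib/ld-linux.so.2) that is missing, not the file itself. A quick check inside the container (the file utility is an assumption here; it may need apt-get install -y file first):
file ./app                 # architecture and the interpreter the binary expects
ldd ./app                  # shared libraries; "not found" entries are the gaps
ls -l /lib/ld-linux.so.2   # the 32-bit loader provided by libc6:i386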
I built your Dockerfile with no errors. Do you have more details?
I have tested the internet connection with wget https://www.google.com and it worked from inside the Docker container. But when I run headless Firefox with the Selenium Python bindings, Selenium throws a TimeoutException:
>> docker run myselcontainer
Traceback (most recent call last):
File "run.py", line 24, in <module>
driver = webdriver.Firefox(service_log_path=os.devnull, options=options, capabilities=capabilities, firefox_profile=profile)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
RemoteWebDriver.__init__(
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Connection refused (os error 111)
But when I run the same Python file on my host, it runs completely fine.
(Kindly do not suggest the docker-selenium image; I have reasons not to use it. Apart from changing the base image, any suggestion or query is welcome.)
Below is the run.py:
from selenium import webdriver
import os
# Set proper profile
profile = webdriver.FirefoxProfile()
profile.set_preference("security.fileuri.strict_origin_policy", False) # disable Strict Origin Policy
profile.set_preference("dom.webdriver.enabled", False) # disable Strict Origin Policy
# Capabilities
capabilities = webdriver.DesiredCapabilities.FIREFOX
capabilities['marionette'] = True
# Options
options = webdriver.FirefoxOptions()
options.add_argument("--log-level=OFF")
# Headless mode (set to False to debug with a visible browser)
options.headless = True
driver = webdriver.Firefox(service_log_path=os.devnull, options=options, capabilities=capabilities, firefox_profile=profile)
driver.set_window_size(1920, 1080)
driver.get('https://www.google.com')
print(driver.page_source)
driver.quit()
And below is the Dockerfile:
FROM python:3.8-slim-buster
# Python optimization
## Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
## Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
# Locales
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV DEBIAN_FRONTEND=noninteractive
# Installation required for selenium
RUN apt-get update -y \
&& apt-get install --no-install-recommends --no-install-suggests -y tzdata ca-certificates bzip2 curl wget libc-dev libxt6 \
&& apt-get install --no-install-recommends --no-install-suggests -y `apt-cache depends firefox-esr | awk '/Depends:/{print$2}'` \
&& update-ca-certificates \
# Cleanup unnecessary stuff
&& apt-get purge -y --auto-remove \
-o APT::AutoRemove::RecommendsImportant=false \
&& rm -rf /var/lib/apt/lists/* /tmp/*
# install geckodriver
RUN GECKODRIVER_VERSION=`curl https://github.com/mozilla/geckodriver/releases/latest | grep -Po 'v[0-9]+.[0-9]+.[0-9]+'` && \
wget https://github.com/mozilla/geckodriver/releases/download/$GECKODRIVER_VERSION/geckodriver-$GECKODRIVER_VERSION-linux64.tar.gz && \
tar -zxf geckodriver-$GECKODRIVER_VERSION-linux64.tar.gz -C /usr/local/bin && \
chmod +x /usr/local/bin/geckodriver && \
rm geckodriver-$GECKODRIVER_VERSION-linux64.tar.gz
# install firefox
RUN FIREFOX_SETUP=firefox-setup.tar.bz2 && \
wget -O $FIREFOX_SETUP "https://download.mozilla.org/?product=firefox-latest&os=linux64" && \
tar xjf $FIREFOX_SETUP -C /opt/ && \
ln -s /opt/firefox/firefox /usr/bin/firefox && \
rm $FIREFOX_SETUP
# Install pip requirements
RUN python -m pip install --upgrade pip && python -m pip install --no-cache-dir selenium scrapy
ENV APP_HOME /usr/src/app
WORKDIR /$APP_HOME
COPY . $APP_HOME/
RUN export PYTHONPATH=$PYTHONPATH:$APP_HOME
# Switching to a non-root user, please refer to https://aka.ms/vscode-docker-python-user-rights
RUN useradd appuser && chown -R appuser $APP_HOME
USER appuser
CMD [ "python3", "run.py" ]
My container build and run commands are:
docker build -t myselcontainer .
docker run myselcontainer
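Not necessarily the cause here, but worth ruling out when Firefox dies inside a container: Docker's default /dev/shm is only 64 MB, which is often too small for a browser. An otherwise identical run with a larger shared-memory segment:
docker run --shm-size=2g myselcontainer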