poetry packaging isn't finding private libraries - google-cloud-dataflow

I have a Dataflow job with the code base below, and it throws an error saying my custom library is not defined.
My folder structure is as follows:
app-pipeline
├── app-pipeline
│   ├── pipeline.py
│   ├── config.py
│   ├── helper.py
│   └── setup.py
├── pyproject.toml
└── poetry.lock
The Dockerfile is as follows:
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base
ENV PYTHONUNBUFFERED True
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
RUN gcloud auth activate-service-account --key-file=key.json
RUN pip install --no-cache-dir --upgrade pip wheel setuptools
RUN pip install poetry==1.2.1
RUN poetry self add "keyrings.google-artifactregistry-auth=^1" -v
RUN poetry config virtualenvs.create false
RUN poetry install -vvvvvvvvvvvv
# Entry Point for Dataflow Job:
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="./app_pipeline/events_pipeline.py"
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="./app_pipeline/setup.py"
#ENV FLEX_TEMPLATE_PYTHON_CONFIG_FILE=
After the dataflow application runs, it cannot find the private libraries that were installed during poetry install, and it fails with:
library not defined
If I copy the folders of those private libraries directly into the app-pipeline folder, everything works fine. In the image, the private libraries are installed under /usr/local/lib/python3.7/site-packages/.
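A quick way to confirm that inside the image (a diagnostic sketch; my-private-lib / my_private_lib are placeholders for the real package and module names) is to add a check right after poetry install:
# hypothetical check, not part of the original Dockerfile
RUN pip show my-private-lib
RUN python -c "import my_private_lib"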
setup.py
import setuptools

REQUIRED_PACKAGES = [
    'numpy',
]

setuptools.setup(
    name='app',
    version='0.0.1',
    description='api workflow package.',
    install_requires=REQUIRED_PACKAGES,
    packages=setuptools.find_packages(),
    cmdclass={
        # Command classes instantiated and run during pip install scenarios;
        # build and CustomCommands are defined elsewhere and omitted here.
        'build': build,
        'CustomCommands': CustomCommands,
    })
Looks like I am missing something here.

Related

Building Go project from Dockerfile says package not in GOROOT

This is the Dockerfile I have:
FROM golang:1.13 AS go-build-env
FROM utilities:debug
# Build go binary
RUN apk add gcc go
ENV GO111MODULE=on \
CGO_ENABLED=1 \
GOOS=linux \
GOARCH=amd64
COPY myApp/* myApp/
RUN cd myApp && go build && mv ./myApp ../
CMD bash
I am getting the following error when it is doing go build:
> [stage-1 5/8] RUN cd myApp && go build && mv ./myApp ../:
#12 0.751 found packages main (main.go) and codec (zwrapper.go) in /myApp
#12 0.751 main.go:17:2: package codec is not in GOROOT (/usr/lib/go/src/codec)
My GoApp structure is:
myApp
├── codec
│   └── zwrapper.go
├── go.mod
├── go.sum
├── main.go
└── vendor
    ├── codec
    │   └── zwrapper.go
    ├── github.com
    ...
When I run go build from the myApp directory manually it works, but when building the Dockerfile it keeps complaining. Do I have to set up GOPATH and GOROOT in the Dockerfile? Any ideas for how to get around this?
Go expects code to live in a fixed build location (which is strict): either your home directory under go/src/, or wherever GOPATH points.
In your case you have to set your GOPATH:
ENV GOPATH /myApp
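A sketch of where that line could go in the Dockerfile above (the placement is an assumption, not from the original answer; the rest of the Dockerfile stays unchanged):
# set GOPATH before running the build step
ENV GOPATH=/myApp
COPY myApp/* myApp/
RUN cd myApp && go build && mv ./myApp ../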

Python.h: No such file or directory on Amazon Linux Lambda Container

I am trying to build this public.ecr.aws/lambda/python:3.6 based Dockerfile with a requirements.txt file that contains some libraries that need gcc/g++ to build. I'm getting an error of a missing Python.h file despite the fact that I installed the python development package and /usr/include/python3.6m/Python.h exists in the file system.
Dockerfile
FROM public.ecr.aws/lambda/python:3.6
RUN yum install -y gcc gcc-c++ python36-devel.x86_64
RUN pip install --upgrade pip && \
pip install cyquant
COPY app.py ./
CMD ["app.handler"]
When I build this with
docker build -t redux .
I get the following error
cyquant/dimensions.cpp:4:20: fatal error: Python.h: No such file or directory
#include "Python.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
Notice, however, that my Dockerfile yum installs the development package. I have also tried the yum package python36-devel.i686 with no change.
What am I doing wrong?
The pip that you're executing lives in /var/lang/bin/pip, whereas the python you're installing with yum lives under the /usr prefix, so the headers from python36-devel never end up where the Lambda image's own interpreter looks for them.
Presumably you could use /usr/bin/pip directly to install, but I'm not sure whether that works correctly with the lambda environment.
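One way to see the mismatch for yourself (a diagnostic sketch, not part of the original answer) is to print which interpreter the build uses and where it expects its headers:
RUN command -v pip && command -v python3 && \
    python3 -c "import sysconfig; print(sysconfig.get_paths()['include'])"
On the Lambda base image this prints paths under /var/lang rather than /usr, which would explain why the headers at /usr/include/python3.6m are never picked up.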
I was able to duplicate the behavior of the AWS Lambda functionality without their Docker image and it works just fine. This is the Dockerfile I am using.
ARG FUNCTION_DIR="/function/"
FROM python:3.6 AS build
ARG FUNCTION_DIR
ARG NETRC_PATH
RUN echo "${NETRC_PATH}" > /root/.netrc
RUN mkdir -p ${FUNCTION_DIR}
COPY requirements.txt ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}
RUN pip install --upgrade pip && \
pip install --target ${FUNCTION_DIR} awslambdaric && \
pip install --target ${FUNCTION_DIR} --no-warn-script-location -r requirements.txt
FROM python:3.6
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
COPY --from=build ${FUNCTION_DIR} ${FUNCTION_DIR}
COPY main.py ${FUNCTION_DIR}
ENV MPLCONFIGDIR=/tmp/mplconfig
ENTRYPOINT ["/usr/local/bin/python", "-m", "awslambdaric"]
CMD ["main.handler"]

COPY files - next to Dockerfile - don't work and block docker build

I'm trying to build a docker container with an mlflow server inside, using a poetry toml file for the dependencies. (The two toml files are exactly the same; the copy under files/ was just an attempt to figure out the problem.)
tree:
├── docker-entrypoint.sh
├── Dockerfile
├── files
│ └── pyproject.toml
├── git.sh
├── pyproject.toml
└── README.md
As you can see, my toml file is next to the Dockerfile, yet COPY pyproject.toml ./ doesn't work.
Dockerfile
FROM python:3.6.10-alpine3.10 as base
LABEL maintainer=""
ENV PYTHONFAULTHANDLER 1
ENV PYTHONHASHSEED random
ENV PYTHONUNBUFFERED 1
ENV MLFLOW_HOME ./
ENV SERVER_PORT 5000
ENV MLFLOW_VERSION 0.7.0
ENV SERVER_HOST 0.0.0.0
ENV FILE_STORE ${MLFLOW_HOME}/fileStore
ENV ARTIFACT_STORE ${MLFLOW_HOME}/artifactStore
ENV PIP_DEFAULT_TIMEOUT 100
ENV PIP_DISABLE_PIP_VERSION_CHECK on
ENV PIP_NO_CACHE_DIR off
ENV POETRY_VERSION 1.0.0
WORKDIR ${MLFLOW_HOME}
FROM base as builder
RUN apk update \
&& apk add --no-cache make gcc musl-dev python3-dev libffi-dev openssl-dev subversion
#download project file from github repo
RUN svn export https://github.com/MChrys/QuickSign/trunk/ \
&& pip install poetry==${POETRY_VERSION} \
&& mkdir -p ${FILE_STORE} \
&& mkdir -p ${ARTIFACT_STORE}\
&& python -m venv /venv
COPY pyproject.toml ./
RUN poetry export -f requirements.txt | /venv/bin/pip install -r --allow-root-install /dev/stdin
COPY . .
RUN poetry build && /venv/bin/pip install dist/*.whl
FROM base as final
RUN apk add --no-cache libffi libpq
COPY --from=builder /venv /venv
COPY docker-entrypoint.sh ./
EXPOSE $SERVER_PORT
VOLUME ["${FILE_STORE}", "${ARTIFACT_STORE}"]
CMD ["./docker-entrypoint.sh"]
the build command :
docker build - < Dockerfile
I get this error :
Step 21/32 : COPY pyproject.toml ./
COPY failed: stat /var/lib/docker/tmp/docker-builder335195979/pyproject.toml: no such file or directory
pyproject.toml
requires = ["poetry>=1.0.0", "mlflow>=0.7.0", "python>=3.6"]
build-backend = "poetry.masonry.api"
[tool.poetry]
name = "Sign"
description = ""
version = "1.0.0"
readme = "README.md"
authors = [
""
]
license = "MIT"
[tool.poetry.dependencies]
python = "3.6"
numpy = "1.14.3"
scipy = "*"
pandas = "0.22.0"
scikit-learn = "0.19.1"
cloudpickle = "*"
mlflow ="0.7.0"
tensorflow = "^2.0.0"
[tool.poetry.dev-dependencies]
pylint = "*"
docker-compose = "^1.25.0"
docker-image-size-limit = "^0.2.0"
tomlkit = "^0.5.8"
docker-entrypoint.sh
#!/bin/sh
set -e
. /venv/bin/activate
mlflow server \
--file-store $FILE_STORE \
--default-artifact-root $ARTIFACT_STORE \
--host $SERVER_HOST \
--port $SERVER_PORT
If I add RUN pwd; ls just before the first COPY, I obtain:
Step 20/31 : RUN pwd; ls
---> Running in e8ec36dd6ca8
/
artifactStore
bin
dev
etc
fileStore
home
lib
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
trunk
usr
var
venv
Removing intermediate container e8ec36dd6ca8
---> d7bba641bd7c
Step 21/31 : COPY pyproject.toml ./
COPY failed: stat /var/lib/docker/tmp/docker-builder392824737/pyproject.toml: no such file or directory
Try
docker build -t test .
instead of
docker build - < Dockerfile
When you pipe the Dockerfile in over stdin, the build runs without a build context, so there is nothing for COPY to copy. Passing . as the last argument sends the current directory (including pyproject.toml) to the daemon as the context.

Monolith docker application with webpack

I am running my monolith application in a docker container and k8s on GKE.
The application contains python and node dependencies, plus webpack for the front-end bundle.
We have implemented CI/CD, which takes around 5-6 minutes to build and deploy a new version to the k8s cluster.
The main goal is to reduce the build time as much as possible. The Dockerfile is multi-stage.
Webpack takes most of that time to generate the bundle. To build the docker image I am already using a high-spec worker.
To reduce the time I tried using the Kaniko builder.
Issue:
Since docker caches layers, the python part works perfectly. But when any JS or CSS file changes, a new bundle needs to be generated; instead, the cached layer is reused.
Is there any way to separate out the bundle build, or to control the cache by passing some value to the Dockerfile?
Here is my Dockerfile:
FROM python:3.5 AS python-build
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt &&\
pip3 install Flask-JWT-Extended==3.20.0
ADD . /app
FROM node:10-alpine AS node-build
WORKDIR /app
COPY --from=python-build ./app/app/static/package.json app/static/
COPY --from=python-build ./app ./
WORKDIR /app/app/static
RUN npm cache verify && npm install && npm install -g --unsafe-perm node-sass && npm run sass && npm run build
FROM python:3.5-slim
COPY --from=python-build /root/.cache /root/.cache
WORKDIR /app
COPY --from=node-build ./app ./
RUN apt-get update -yq \
&& apt-get install curl -yq \
&& pip install -r requirements.txt
EXPOSE 9595
CMD python3 run.py
I would suggest creating separate build pipelines for those docker images whose npm and pip requirements don't change often.
This will improve the speed considerably by cutting down the round-trips to the npm and pip registries.
Use a private docker registry (the official one, or something like VMware Harbor or Sonatype Nexus OSS).
You store those builder images in your registry and reuse them whenever something in the project changes.
Something like this:
First Docker Builder // python-builder:YOUR_TAG [gitrev, date, etc.]
docker build --no-cache -t python-builder:YOUR_TAG -f Dockerfile.python.build .
FROM python:3.5
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt &&\
pip3 install Flask-JWT-Extended==3.20.0
Second Docker Builder // js-builder:YOUR_TAG [gitrev, date, etc.]
docker build --no-cache -t js-builder:YOUR_TAG -f Dockerfile.js.build .
FROM node:10-alpine
WORKDIR /app
COPY app/static/package.json /app/app/static
WORKDIR /app/app/static
RUN npm cache verify && npm install && npm install -g --unsafe-perm node-sass
Your Application Multi-stage build:
docker build --no-cache -t app_delivery:YOUR_TAG -f Dockerfile.app .
FROM python-builder:YOUR_TAG as python-build
# Nothing, already "stoned" in another build process
FROM js-builder:YOUR_TAG AS node-build
ADD ##### YOUR JS/CSS files only here, required from npm! ###
RUN npm run sass && npm run build
FROM python:3.5-slim
COPY . /app # your original clean app
COPY --from=python-build #### only the files installed with the pip command
WORKDIR /app
COPY --from=node-build ##### Only the generated files from npm here! ###
RUN apt-get update -yq \
&& apt-get install curl -yq \
&& pip install -r requirements.txt
EXPOSE 9595
CMD python3 run.py
A question: why do you install curl and run the pip install -r requirements.txt command again in the final docker image?
Running apt-get update and install on every build without cleaning the apt cache (the /var/cache/apt and /var/lib/apt/lists folders) produces a bigger image.
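As an illustration (not from the original Dockerfile), the curl installation could clean up after itself in the same layer so the package lists never land in the image:
RUN apt-get update -yq \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*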
As a suggestion, use the docker build command with the --no-cache option to avoid reusing cached results:
docker build --no-cache -t your_image:your_tag -f your_dockerfile .
Remarks:
You'll have 3 separate Dockerfiles, as I listed above.
Build the Docker images 1 and 2 only if you change your python-pip and node-npm requirements, otherwise keep them fixed for your project.
If any dependency requirement changes, then update the docker image involved and then the multistage one to point to the latest built image.
You should always build only the source code of your project (CSS, JS, python). In this way, you have also guaranteed reproducible builds.
To optimize your environment and make copying files across the multi-stage builders easier, try using a virtualenv for the python build, along the lines of the sketch below.
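A minimal sketch of that idea (the /opt/venv path is illustrative, not from the answer):
# builder stage: install dependencies into a self-contained virtualenv
FROM python:3.5 AS python-build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt ./
RUN pip install -r requirements.txt

# final stage: copy only the virtualenv, leaving pip's cache and build tools behind
FROM python:3.5-slim
COPY --from=python-build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . /app
CMD ["python3", "run.py"]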

Why can't composer find a composer.json file inside my Docker container?

I have a pretty straightforward Dockerfile based on Thelia's and Composer's. I want it to do as much setup as possible, so to that end, I am installing composer in the PATH within the container and then trying to run composer install. However, at that point, it seems my local/mounted files don't exist within the container yet (see output from RUN echo pwd: ... below).
The build fails with this error message:
Composer could not find a composer.json file in /var/www/html
To initialize a project, please create a composer.json file as described in the https://getcomposer.org/ "Getting Started" section
ERROR: Service 'web' failed to build: The command '/bin/sh -c composer install' returned a non-zero code: 1
Note that building without the RUN composer install instruction and then running docker-compose exec web composer install works.
Does the mount declared in docker-compose.yml not take effect until the image has been completely built? Do I need to explicitly COPY my local files for them to be visible during the build process?
Dockerfile:
FROM php:5.6-apache
COPY docker-php-pecl-install /usr/local/bin/
RUN apt-get update && apt-get install -y \
libfreetype6-dev \
libjpeg62-turbo-dev \
libmcrypt-dev \
libpng12-dev \
libicu-dev \
git \
zip \
libzip-dev \
&& docker-php-ext-install intl pdo_mysql mcrypt mbstring zip calendar \
&& docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/ \
&& docker-php-ext-install gd \
&& docker-php-pecl-install xdebug-2.3.3
RUN a2enmod rewrite
RUN usermod -u 1000 www-data
COPY config/php.ini /usr/local/etc/php/
COPY config/vhost/vhost.conf /etc/apache2/sites-enabled/
# Expose webroot
VOLUME /var/www/html
WORKDIR /var/www/html
COPY . /var/www/html # Added at Edit 1
# Allow Composer to be run as root
ENV COMPOSER_ALLOW_SUPERUSER 1
# Setup the Composer installer
RUN curl -o /tmp/composer-setup.php https://getcomposer.org/installer \
&& curl -o /tmp/composer-setup.sig https://composer.github.io/installer.sig \
&& php -r "if (hash('SHA384', file_get_contents('/tmp/composer-setup.php')) !== trim(file_get_contents('/tmp/composer-setup.sig'))) { unlink('/tmp/composer-setup.php'); echo 'Invalid installer' . PHP_EOL; exit(1); }" \
&& php /tmp/composer-setup.php \
&& chmod a+x composer.phar \
&& mv composer.phar /usr/local/bin/composer
# Install composer dependencies
RUN echo pwd: `pwd` && echo ls: `ls` # outputs:
# pwd: /var/www/html
# ls:
RUN composer install
docker-compose.yml:
web:
  build: ./docker/php
  ports:
    - "8002:80"
  links:
    - mariaDB
  environment:
    SYMFONY_ENV: dev
  command: /usr/sbin/apache2ctl -D FOREGROUND
  volumes:
    - .:/var/www/html
mariaDB:
  ...
Edit 1 - after COPYing files explicitly
I added a COPY . /var/www/html after the VOLUME and WORKDIR instructions, and noticed something weird. Now the RUN pwd: ... instruction echoes:
pwd: /var/www/html
ls: Dockerfile config docker-php-pecl-install
These are the contents of my docker/php directory, not the project root!
Full(ish) directory structure:
.
├── LICENSE.txt
├── Readme.md
├── Thelia
├── bin
│   └── ...
├── bootstrap.php
├── cache
│   └── ...
├── change-version.sh
├── composer.json
├── composer.lock
├── docker
│   └── php
│       ├── Dockerfile
│       ├── config
│       └── docker-php-pecl-install
├── docker-compose.yml
├── lib
│   └── Thelia
│       └── Project
├── local
│   └── ...
├── log
├── templates
│   └── ...
├── vendor
│   └── ...
└── web
    ├── favicon.ico
    ├── index.php
    ├── index_dev.php
    └── robots.txt
So my revised question is: how do I tell Docker to COPY from the context in which I run docker-compose? I thought this was the default behavior for docker-compose, and that's how it's worked for me on past projects...
You define a docker-compose 1.7 file with:
web:
  build: ./docker/php
which basically results in:
docker build -f ./docker/php/Dockerfile ./docker/php/
So the build path is basically the context of the build process.
If you really want to define the context, you have to upgrade your docker-compose definition to at least v2 or v3.
There you should be able to do:
version: '2'
services:
  web:
    build:
      context: ./
      dockerfile: ./docker/php/Dockerfile
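With that definition the build context becomes the project root, roughly equivalent to running:
docker build -f ./docker/php/Dockerfile .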
I would advise against that, because:
whenever someone edits that Dockerfile, they always have to keep the parent folder's context structure in mind, which is quite hard to maintain
as the context grows ever larger, you will slow the build process by sending a huge amount of data into the container build's context. A .dockerignore file would help, but you would have to use it as a whitelist instead of a blacklist.
If you want to build multiple php containers that all install dependencies with your docker-php-pecl-install file, I advise you not to inject one giant context. Instead, create a base php image from its own Dockerfile and reference it as the FROM source in the other containers' Dockerfiles. That base image can be an unrelated project; all you need to do is build the image once and reuse it in your other php projects, along the lines of the sketch below.
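A minimal sketch of that layout (the php-base:1.0 tag and file paths are illustrative, not from the answer):
# docker/php-base/Dockerfile — shared base image, built once and pushed to your registry,
# containing the extensions, composer setup, and anything else every project needs
FROM php:5.6-apache
COPY docker-php-pecl-install /usr/local/bin/
RUN a2enmod rewrite \
    && docker-php-pecl-install xdebug-2.3.3

# per-project Dockerfile — reuses the base, so only the project code is in its context
FROM php-base:1.0
WORKDIR /var/www/html
COPY . /var/www/html
RUN composer install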
If you want to reuse that image in your entire company, it's helpful to setup a private docker registry, where you and your team can push and pull images from.
