Docker: How to create a table for local dynamo DB? - docker

I'm trying to create a docker container with local Amazon Dynamo DB. And it actually works. But I can not understand how to create a table for this image in Docker file?
Through the javascript I create a table like this:
var params = {
TableName: 'UserActivity',
KeySchema: [
{
AttributeName: 'user_id',
KeyType: 'HASH',
},
{
AttributeName: 'user_action',
KeyType: 'RANGE',
}
],
AttributeDefinitions: [
{
AttributeName: 'user_id',
AttributeType: 'S',
},
{
AttributeName: 'user_action',
AttributeType: 'S',
}
],
ProvisionedThroughput: {
ReadCapacityUnits: 2,
WriteCapacityUnits: 2,
}
};
dynamodb.createTable(params, function(err, data) {
if (err) ppJson(err); // an error occurred
else ppJson(data); // successful response
});
And here is my Docker file:
FROM openjdk:8-jre-alpine
ENV DYNAMODB_VERSION=latest
RUN apk add --update curl && \
rm -rf /var/cache/apk/* && \
curl -O https://s3-us-west-2.amazonaws.com/dynamodb-local/dynamodb_local_${DYNAMODB_VERSION}.tar.gz && \
tar zxvf dynamodb_local_${DYNAMODB_VERSION}.tar.gz && \
rm dynamodb_local_${DYNAMODB_VERSION}.tar.gz
EXPOSE 8000
ENTRYPOINT ["java", "-Djava.library.path=.", "-jar", "DynamoDBLocal.jar", "--sharedDb", "-inMemory", "-port", "8000"]

You need to point your DynamoDB client to the local DynamoDB endpoint.
Do something like:
dynamodb = new AWS.DynamoDB({
endpoint: new AWS.Endpoint('http://localhost:8000')
});
dynamodb.createTable(...);
UPDATE
Alternatively you can use AWS CLI to create tables in your local DynamoDB without using JavaSciprt code. You need to use the "--endpoint-url" to point CLI to your local instance like this:
aws dynamodb list-tables --endpoint-url http://localhost:8000
You need to use create-table command to create a table.

The main problem was in base image which I used for local dynamodb. I mean openjdk:8-jre-alpine
Looks like the alpine distribution does not have something important for functioning of local dynamodb.
So here is a working Dockerfile for local dynamodb:
FROM openjdk:8-jre
ENV DYNAMODB_VERSION=latest
COPY .aws/ root/.aws/
RUN curl -O https://s3-us-west-2.amazonaws.com/dynamodb-local/dynamodb_local_${DYNAMODB_VERSION}.tar.gz && \
tar zxvf dynamodb_local_${DYNAMODB_VERSION}.tar.gz && \
rm dynamodb_local_${DYNAMODB_VERSION}.tar.gz
EXPOSE 8000
ENTRYPOINT ["java", "-Djava.library.path=.", "-jar", "DynamoDBLocal.jar", "--sharedDb", "-inMemory"]
I'd be happy to read what is the main problem of alpine in this particular case.

Related

AWS ECS EC2 ECR not updating files after deployment with docker volume nginx

I am being stuck on issue with my volume and ECS.
I would like to attach volume so i can store there .env files etc so i dont have to recreate this manually after every deployment.
The problem is, the way I have it set up it does not update(or overwrite) files, which are pushed to ECR. So If i do code change and push it to git, it does following:
Creates new image and pushes it to ECR
It Creates new containers with image pushed to ECR (it dynamically assigns tag to the image)
when I do docker ps on EC2 I see new containers, and container with code changes is built from correct image which has just been pushed to ECR. So it seems all is working fine until this point.
But the code changes dont appear when i refresh browser nor after clearing caches.
I am attaching volume to the folder /var/www/html where sits my app, so from my understanding this code should get replaced during deployment. But the problem is, it does not replaces the code.
When I remove the volume, I can see the code changes everytime deployment finishes but I also always have to create manually .env file + run couple of commands.
PS: I have another container (mysql) which is setting volume exactly the same way and changes I do in database are persistent even after new container is created.
Please see my Docker file and taskDefinition.json to see how I deal with volumes.
Dockerfile:
FROM alpine:${ALPINE_VERSION}
# Setup document root
WORKDIR /var/www/html
# Install packages and remove default server definition
RUN apk add --no-cache \
curl \
nginx \
php8 \
php8-ctype \
php8-curl \
php8-dom \
php8-fpm \
php8-gd \
php8-intl \
php8-json \
php8-mbstring \
php8-mysqli \
php8-pdo \
php8-opcache \
php8-openssl \
php8-phar \
php8-session \
php8-xml \
php8-xmlreader \
php8-zlib \
php8-tokenizer \
php8-fileinfo \
php8-json \
php8-xml \
php8-xmlwriter \
php8-simplexml \
php8-dom \
php8-pdo_mysql \
php8-pdo_sqlite \
php8-tokenizer \
php8-pecl-redis \
php8-bcmath \
php8-exif \
supervisor \
nano \
sudo
# Create symlink so programs depending on `php` still function
RUN ln -s /usr/bin/php8 /usr/bin/php
# Configure nginx
COPY tools/docker/config/nginx.conf /etc/nginx/nginx.conf
# Configure PHP-FPM
COPY tools/docker/config/fpm-pool.conf /etc/php8/php-fpm.d/www.conf
COPY tools/docker/config/php.ini /etc/php8/conf.d/custom.ini
# Configure supervisord
COPY tools/docker/config/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# Make sure files/folders needed by the processes are accessable when they run under the nobody user
RUN chown -R nobody.nobody /var/www/html /run /var/lib/nginx /var/log/nginx
# Install composer
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
RUN apk update && apk add bash
# Install node npm
RUN apk add --update nodejs npm \
&& npm config set --global loglevel warn \
&& npm install --global marked \
&& npm install --global node-gyp \
&& npm install --global yarn \
# Install node-sass's linux bindings
&& npm rebuild node-sass
# Switch to use a non-root user from here on
USER nobody
# Add application
COPY --chown=nobody ./ /var/www/html/
RUN cat /var/www/html/resources/js/Components/Sections/About.vue
RUN composer install --optimize-autoloader --no-interaction --no-progress --ignore-platform-req=ext-zip --ignore-platform-req=ext-zip
USER root
RUN yarn && yarn run production
USER nobody
VOLUME /var/www/html
# Expose the port nginx is reachable on
EXPOSE 8080
# Let supervisord start nginx & php-fpm
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
# Configure a healthcheck to validate that everything is up&running
HEALTHCHECK --timeout=10s CMD curl --silent --fail http://127.0.0.1:8080/fpm-ping
taskDefinition.json
{
"containerDefinitions": [
{
"name": "fooweb-nginx-php",
"cpu": 100,
"memory": 512,
"links": [
"mysql"
],
"portMappings": [
{
"containerPort": 8080,
"hostPort": 80,
"protocol": "tcp"
}
],
"essential": true,
"environment": [],
"mountPoints": [
{
"sourceVolume": "fooweb-storage-web",
"containerPath": "/var/www/html"
}
]
},
{
"name": "mysql",
"image": "mysql",
"cpu": 50,
"memory": 512,
"portMappings": [
{
"containerPort": 3306,
"hostPort": 4306,
"protocol": "tcp"
}
],
"essential": true,
"environment": [
{
"name": "MYSQL_DATABASE",
"value": "123"
},
{
"name": "MYSQL_PASSWORD",
"value": "123"
},
{
"name": "MYSQL_USER",
"value": "123"
},
{
"name": "MYSQL_ROOT_PASSWORD",
"value": "123"
}
],
"mountPoints": [
{
"sourceVolume": "fooweb-storage-mysql",
"containerPath": "/var/lib/mysql"
}
]
}
],
"family": "art_web_task_definition",
"taskRoleArn": "arn:aws:iam::123:role/ecs-task-execution-role",
"executionRoleArn": "arn:aws:iam::123:role/ecs-task-execution-role",
"networkMode": "bridge",
"volumes": [
{
"name": "fooweb-storage-mysql",
"dockerVolumeConfiguration": {
"scope": "shared",
"autoprovision": true,
"driver": "local"
}
},
{
"name": "fooweb-storage-web",
"dockerVolumeConfiguration": {
"scope": "shared",
"autoprovision": true,
"driver": "local"
}
}
],
"placementConstraints": [],
"requiresCompatibilities": [
"EC2"
],
"cpu": "1536",
"memory": "1536",
"tags": []
}
So I believe there will be some problem with the way I have set the volume or maybe there could be some permission issue ?
Many thanks !
"I am attaching volume to the folder /var/www/html where sits my app,
so from my understanding this code should get replaced during
deployment."
That's the opposite of how docker volumes work.
It is going to ignore anything in /var/www/html inside the docker image, and instead reuse whatever you have in the mounted volume. Mounted volumes are primarily for persisting files between container restarts and image changes. If there is updated code in /var/www/html inside the image you are building, and you want that updated code to be active when your application is deployed, then you can't mount that as a volume.
If you are specifying a VOLUME instruction in your Dockerfile, then the very first time you ran your container in it would have "initialized" the volume with the files that are inside the docker container, as part of the process of creating the volume. After that, the files in the volume on the host server are persisted across container restarts/deployments, and any new updates to that path inside the new docker images are ignored.

How do I get a Command to run from a Dockerfile.aws.json on Elastic Beanstalk?

I have a Dockerfile and a Dockerfile.aws.json:
{
"AWSEBDockerrunVersion": "1",
"Ports": [{
"ContainerPort": "5000",
"HostPort": "5000"
}],
"Volumes": [{
"HostDirectory": "/tmp/download/models",
"ContainerDirectory": "/models"
}],
"Logging": "/var/log/nginx",
"Command": "mkdir -p /tmp && axel https://example.com/models.zip -o /tmp/models.zip"
}
But when I deploy, it doesn't run the Command that I specified. What am I doing wrong?
If you have ENTRYPOINT in your Dockerfile, than the Command gets appended as its arguments:
Specify a command to execute in the container. If you specify an Entrypoint, then Command is added as an argument to Entrypoint. For more information, see CMD in the Docker documentation.
Thus your Command mkdir -p /tmp ... will be used as an argument to python3 -m flask run --host=0.0.0.0, resulting in error. This could explain why you experience issue.
I tried to recreate the issue initially using your Command structure but had some problems. What worked was using Command in the following way:
"Command": "/bin/bash -c \"mkdir -p /tmp && axel https://example.com/models.zip -o /tmp/models.zip\""
My Dockerfile did not have Entrypoint. Thus to run your python you could maybe do the following (assuming everything else is correct):
"Command": "/bin/bash -c \"mkdir -p /tmp && axel https://example.com/models.zip -o /tmp/models.zip && python3 -m flask run --host=0.0.0.0\""
Do you have the Dockerfile content?
It most likely they your ENTRYPOINT script does not receive parameters, or it is ignoring it.
What you can do is something similar to this.
You have an entrypoint script that receive the command passed in aws.json as parameter, execute it and then call your real python command.
Or you can replace your ENTRYPOINT by something similar to this:
ENTRYPOINT ["/bin/bash"]
and your default command will be:
CMD ["python3 ..."]
This way when running locally you only run the python3 command.
When running in aws, you can change your Command and append the python to the end, as mentioned by Marcin. Both cases works

OpenFaaS serve model using Tensorflow serving

I'd like to serve Tensorfow Model by using OpenFaaS. Basically, I'd like to invoke the "serve" function in such a way that tensorflow serving is going to expose my model.
OpenFaaS is running correctly on Kubernetes and I am able to invoke functions via curl or from the UI.
I used the incubator-flask as example, but I keep receiving 502 Bad Gateway all the time.
The OpenFaaS project looks like the following
serve/
- Dockerfile
stack.yaml
The inner Dockerfile is the following
FROM tensorflow/serving
RUN mkdir -p /home/app
RUN apt-get update \
&& apt-get install curl -yy
RUN echo "Pulling watchdog binary from Github." \
&& curl -sSLf https://github.com/openfaas-incubator/of-watchdog/releases/download/0.4.6/of-watchdog > /usr/bin/fwatchdog \
&& chmod +x /usr/bin/fwatchdog
WORKDIR /root/
# remove unecessery logs from S3
ENV TF_CPP_MIN_LOG_LEVEL=3
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_REGION=${AWS_REGION}
ENV S3_ENDPOINT=${S3_ENDPOINT}
ENV fprocess="tensorflow_model_server --rest_api_port=8501 \
--model_name=${MODEL_NAME} \
--model_base_path=${MODEL_BASE_PATH}"
# Set to true to see request in function logs
ENV write_debug="true"
ENV cgi_headers="true"
ENV mode="http"
ENV upstream_url="http://127.0.0.1:8501"
# gRPC tensorflow serving
# EXPOSE 8500
# REST tensorflow serving
# EXPOSE 8501
RUN touch /tmp/.lock
HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1
CMD [ "fwatchdog" ]
the stack.yaml file looks like the following
provider:
name: faas
gateway: https://gateway-url:8080
functions:
serve:
lang: dockerfile
handler: ./serve
image: repo/serve-model:latest
imagePullPolicy: always
I build the image with faas-cli build -f stack.yaml and then I push it to my docker registry with faas-cli push -f stack.yaml.
When I execute faas-cli deploy -f stack.yaml -e AWS_ACCESS_KEY_ID=... I get Accepted 202 and it appears correctly among my functions. Now, I want to invoke the tensorflow serving on the model I specified in my ENV.
The way I try to make it work is to use curl in this way
curl -d '{"inputs": ["1.0", "2.0", "5.0"]}' -X POST https://gateway-url:8080/function/deploy-model/v1/models/mnist:predict
but I always obtain 502 Bad Gateway.
Does anybody have experience with OpenFaaS and Tensorflow Serving? Thanks in advance
P.S.
If I run tensorflow serving without of-watchdog (basically without the openfaas stuff), the model is served correctly.
Elaborating the link mentioned by #viveksyngh.
tensorflow-serving-openfaas:
Example of packaging TensorFlow Serving with OpenFaaS to be deployed and managed through OpenFaaS with auto-scaling, scale-from-zero and a sane configuration for Kubernetes.
This example was adapted from: https://www.tensorflow.org/serving
Pre-reqs:
OpenFaaS
OpenFaaS CLI
Docker
Instructions:
Clone the repo
$ mkdir -p ~/dev/
$ cd ~/dev/
$ git clone https://github.com/alexellis/tensorflow-serving-openfaas
Clone the sample model and copy it to the function's build context
$ cd ~/dev/tensorflow-serving-openfaas
$ git clone https://github.com/tensorflow/serving
$ cp -r serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu ./ts-serve/saved_model_half_plus_two_cpu
Edit the Docker Hub username
You need to edit the stack.yml file and replace alexellis2 with your Docker Hub account.
Build the function image
$ faas-cli build
You should now have a Docker image in your local library which you can deploy to a cluster with faas-cli up
Test the function locally
All OpenFaaS images can be run stand-alone without OpenFaaS installed, let's do a quick test, but replace alexellis2 with your own name.
$ docker run -p 8081:8080 -ti alexellis2/ts-serve:latest
Now in another terminal:
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://127.0.0.1:8081/v1/models/half_plus_two:predict
{
"predictions": [2.5, 3.0, 4.5
]
}
From here you can run faas-cli up and then invoke your function from the OpenFaaS UI, CLI or REST API.
$ export OPENFAAS_URL=http://127.0.0.1:8080
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' $OPENFAAS_URL/function/ts-serve/v1/models/half_plus_two:predict
{
"predictions": [2.5, 3.0, 4.5
]
}

Accessing `docker run` arguments from the Dockerfile

I run an image like that:
docker run <image_name> <config_file>
where config_file is the path to a JSON file which contains the configuration of my application.
Inside the Dockerfile, I do
ENTRYPOINT ["uwsgi", \
"--log-encoder", "json {\"msg\":\"${msg}\"}\n", \
"--http", ":80", \
"--master", \
"--wsgi-file", "app.py", \
"--callable", "app", \
"--threads", "10", \
"--pyargv"]
At the same time, I would like to access some of the values in the configuration file in the Dockerfile. For example to configure the JSON log encoder of uWSGI.
How can I do that?
This is impossible. See comment by #David Maze.

Passing Google service account credentials to Docker

My use case is a little different than others with this problem, so a little up-front description:
I am working on Google Cloud and have a "dockerized" Django app. Part of the app depends on using gsutil for moving files to/from a Google Storage bucket. For various reasons, we do not want to use Google Container Engine to manage our containers. Rather, we would like to scale horizontally by starting additional Google Compute VMs which will, in turn, run this Docker container. Similar to https://cloud.google.com/python/tutorials/bookshelf-on-compute-engine except we will use a container rather than pulling a git repository.
The VM's will be built from a basic Debian image, and the startup and installation of dependencies (e.g. Docker itself) will be orchestrated with a startup script (e.g. gcloud compute instances create some-instance --metadata-from-file startup-script=/path/to/startup.sh).
If I manually create a VM, elevate with sudo -s, run gsutil config -f (which creates a credential file at /root/.boto) and then run my docker container (see Dockerfile below) with
docker run -v /root/.boto:/root/.boto username/gs gsutil ls gs://my-test-bucket
then it works. However, that requires my interaction to create the boto file.
My question is: How can I pass the default service credentials to the Docker container that will be starting in that new VM?
gsutil works out of the box on even a "fresh" Debian VM since it is using the default compute engine credentials that all VMs are loaded with. Is there a way to use those credentials and pass them to the docker container? After the first call to gsutil on a fresh VM, I've noticed that it creates ~/.gsutil and ~/.config folders. Unfortunately, mounting both of those in Docker with
docker run -v ~/.config/:/root/.config -v ~/.gsutil:/root/.gsutil username/gs gsutil ls gs://my-test-bucket
does not fix my problem. It tells me:
ServiceException: 401 Anonymous users does not have storage.objects.list access to bucket my-test-bucket.
A minimal gsutil Dockerfile (not mine):
FROM alpine
#install deps and install gsutil
RUN apk add --update \
python \
py-pip \
py-cffi \
py-cryptography \
&& pip install --upgrade pip \
&& apk add --virtual build-deps \
gcc \
libffi-dev \
python-dev \
linux-headers \
musl-dev \
openssl-dev \
&& pip install gsutil \
&& apk del build-deps \
&& rm -rf /var/cache/apk/*
CMD ["gsutil"]
Addition: a workaround:
I have since solved my issue, but it is quite roundabout so I'm still interested in a simpler way, if possible. All the details are below:
First, a description:
I first created a service account in the web console. I then save the JSON keyfile (call it credentials.json) into a storage bucket. In the startup script for the GCE VM, I copy that keyfile to the local filesystem (gsutil cp gs://<bucket>/credentials.json /gs_credentials/). I then start my docker container, mounting that local directory. Then, as the docker container starts, it runs a script that authenticates the credentials.json (which creates a .boto file inside the docker), export BOTO_PATH=, and finally I can perform gsutil operations in the Docker container.
Here are the files for a small working example:
Dockerfile:
FROM alpine
#install deps and install gsutil
RUN apk add --update \
python \
py-pip \
py-cffi \
py-cryptography \
bash \
curl \
&& pip install --upgrade pip \
&& apk add --virtual build-deps \
gcc \
libffi-dev \
python-dev \
linux-headers \
musl-dev \
openssl-dev \
&& pip install gsutil \
&& apk del build-deps \
&& rm -rf /var/cache/apk/*
# install the gcloud SDK-
# this allows us to use gcloud auth inside the container
RUN curl -sSL https://sdk.cloud.google.com > /tmp/gcl \
&& bash /tmp/gcl --install-dir=~/gcloud --disable-prompts
RUN mkdir /startup
ADD gsutil_docker_startup.sh /startup/gsutil_docker_startup.sh
ADD get_account_name.py /startup/get_account_name.py
ENTRYPOINT ["/startup/gsutil_docker_startup.sh"]
gsutil_docker_startup.sh: Takes a single argument, which is the path to a JSON-format service account credentials file. The file exists because the directory on the host machine was mounted in the container.
#!/bin/bash
CRED_FILE_PATH=$1
mkdir /results
# List the bucket, see that it gives a "ServiceException:401"
gsutil ls gs://<input bucket> > /results/before.txt
# authenticate the credentials- this creates a .boto file:
/root/gcloud/google-cloud-sdk/bin/gcloud auth activate-service-account --key-file=$CRED_FILE_PATH
# need to extract the service account which is like:
# <service acct ID>#<google project>.iam.gserviceaccount.com"
SERVICE_ACCOUNT=$(python /startup/get_account_name.py $CRED_FILE_PATH)
# with that service account, we can locate the .boto file:
export BOTO_PATH=/root/.config/gcloud/legacy_credentials/$SERVICE_ACCOUNT/.boto
# List the bucket and copy the file to an output bucket for good measure
gsutil ls gs://<input bucket> > /results/after.txt
gsutil cp /results/*.txt gs://<output bucket>/
get_account_name.py:
import json
import sys
j = json.load(open(sys.argv[1]))
sys.stdout.write(j['client_email'])
Then, the GCE startup script (executed automatically as the VM is started) is:
#!/bin/bash
# <SNIP>
# Install docker, other dependencies
# </SNIP>
# pull docker image
docker pull userName/containerName
# get credential file:
mkdir /cloud_credentials
gsutil cp gs://<bucket>/credentials.json /cloud_credentials/creds.json
# run container
# mount the host machine directory where the credentials were saved.
# Note that the container expects a single arg,
# which is the path to the credential file IN THE CONTAINER
docker run -v /cloud_credentials:/cloud_credentials \
userName/containerName /cloud_credentials/creds.json
You can assign a Specific Service Account to your instance and then use the Application Default Credential in your code. Please verify these points before testing.
Set the instance access scopes to: "Allow full access to all Cloud APIs" as they are not really a security feature
Set the right role to you service account: "Storage Object Viewer"
Authentication Token are retrieved automatically by Application Default Credential via Google Metadata Server which is available from your instance and your Docker containers as well. There is no need to manage any credentials.
def implicit():
from google.cloud import storage
# If you don't specify credentials when constructing the client, the
# client library will look for credentials in the environment.
storage_client = storage.Client()
# Make an authenticated API request
buckets = list(storage_client.list_buckets())
print(buckets)
I also quickly tested with docker and I worked perfectly
yann#test:~$ gsutil cat gs://my-test-bucket/hw.txt
Hello World
yann#test:~$ docker run --rm google/cloud-sdk gsutil cat gs://my-test-bucket/hw.txt
Hello World

Resources