I would like to build a minio docker container for integration test purposes.
I would like to do the following in my Dockerfile.
Create the minio container
Create a test bucket
Copy a small amount of test data into a test bucket
Start the minio service
Test Data
./test-data/foo.txt
./test-data/bar.txt
FROM minio/minio
RUN mkdir -p /buckets/my-bucket
COPY test-data /buckets/my-bucket/test-data
EXPOSE 9000 9001
CMD [ "minio", "server", "/buckets", "--address", ":9000", "--console-address", ":9001" ]
I know that I could run mc in a separate container to populate my bucket, but that requires a little bit of orchestration.
Is there a way that I could accomplish these steps in a Dockerfile?
A Dockerfile's RUN instructions are just shell commands, so you can do pretty much anything you want at build time. For example:
FROM docker.io/minio/minio:latest
COPY --from=docker.io/minio/mc:latest /usr/bin/mc /usr/bin/mc
RUN mkdir /buckets
RUN minio server /buckets & \
server_pid=$!; \
until mc alias set local http://localhost:9000 minioadmin minioadmin; do \
sleep 1; \
done; \
mc mb local/bucket1; \
echo this is file1 | mc pipe local/bucket1/file1; \
echo this is file2 | mc pipe local/bucket1/file2; \
kill $server_pid
CMD ["minio", "server", "/buckets", "--address", ":9000", "--console-address", ":9001"]
If we use the above Dockerfile to build an image named minio-demo, and then start a container like this:
$ docker run --rm -p 127.0.0.1:9000:9000 -p 127.0.0.1:9001:9001 minio-demo
We see:
$ mc alias set demo http://localhost:9000 minioadmin minioadmin
$ mc ls demo
[2022-07-07 22:01:35 EDT] 0B bucket1/
$ mc ls demo/bucket1
[2022-07-07 22:01:35 EDT] 14B STANDARD file1
[2022-07-07 22:01:35 EDT] 14B STANDARD file2
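If you want the bucket to contain the actual files from the question's test-data directory rather than content generated with mc pipe, the same build-time trick works with mc cp. A sketch, assuming the files are first copied to a staging path /test-data (the staging path and bucket name my-bucket are illustrative):
COPY test-data /test-data
RUN minio server /buckets & \
    server_pid=$!; \
    until mc alias set local http://localhost:9000 minioadmin minioadmin; do \
        sleep 1; \
    done; \
    mc mb local/my-bucket; \
    mc cp --recursive /test-data/ local/my-bucket/; \
    kill $server_pid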
Related
I am trying to find a "global" solution for injecting an SSH key into a container. I know that there are several solutions, including Docker BuildKit and so on, but I don't want to build an image and inject the SSH key. I want to inject the SSH key into an existing image by using docker compose.
I use the following docker compose file:
version: '3.1'
services:
  server1:
    image: XXXXXXX
    container_name: server1
    command: bash -c "/root/init.sh && python3 /root/my_python.py"
    environment:
      - MANAGED_HOST=mserver
    volumes:
      - ./init.sh:/root/init.sh
    secrets:
      - id_rsa
secrets:
  id_rsa:
    file: /home/user/.ssh/id_rsa
The init.sh is as follows:
#!/bin/bash
eval "$(ssh-agent -s)" > /dev/null
if [ ! -d "/root/.ssh/" ]; then
mkdir /root/.ssh
ssh-keyscan $MANAGED_HOST > /root/.ssh/known_hosts
fi
ssh-add -k /run/secrets/id_rsa
If I run docker compose with the command parameter bash -c "/root/init.sh && python3 /root/my_python.py", then SSH authentication to the appropriate remote host ($MANAGED_HOST) does not work.
An agent process is running:
root 8 1 0 12:50 ? 00:00:00 ssh-agent -s
known_hosts is OK:
root@c67655d87ced:~# cat /root/.ssh/known_hosts
BLABLABLA ssh-rsa AAAAB3BLABLABLA....
and the agent is running, but the private key is not added:
root@c67655d87ced:~# ssh-add -l
Could not open a connection to your authentication agent.
Now, if I log in to the container (docker exec -it server1 /bin/bash) and run the commands from init.sh one by one on the command line, then SSH authentication to the remote host ($MANAGED_HOST) works?!?
Any idea how I can get this working with docker compose?
It should be enough to cause the file $HOME/.ssh/id_rsa to exist with appropriate permissions; you don't need an ssh agent running.
#!/bin/sh
if ! [ -d "$HOME/.ssh" ]; then
mkdir "$HOME/.ssh"
fi
chmod 0700 "$HOME/.ssh"
if [ -n "$MANAGED_HOST" ]; then
ssh-keyscan "$MANAGED_HOST" >> "$HOME/.ssh/known_hosts"
fi
if [ -f /run/secrets/id_rsa ]; then
cp /run/secrets/id_rsa "$HOME/.ssh/id_rsa"
chmod 0400 "$HOME/.ssh/id_rsa"
fi
# Run the CMD as the main container process
exec "$@"
A typical pattern is to use the Dockerfile ENTRYPOINT to do first-time setup tasks like this. That will get passed the CMD as arguments, and the exec "$@" line at the end of the file runs that as a command. You'd set this up in your image's Dockerfile like:
FROM XXXXXX
...
# Script must be executable on the host, and must start with a
# #!/bin/sh "shebang" line
COPY init.sh /root
# MUST use JSON-array form
ENTRYPOINT ["/root/init.sh"]
# CMD can use either shell or JSON-array form
CMD ["python3", "/root/my_python.py"]
In your specific example, you're launching init.sh as a subprocess. The ssh-agent setup sets some environment variables, like $SSH_AUTH_SOCK, but when the script runs as a subprocess those variables don't get propagated back out to the calling shell. You can use the standard POSIX shell . builtin (the bash source builtin is equivalent, but non-standard) to cause those environment variables to be set in the context of the current shell:
command: sh -c ". /root/init.sh && exec python3 /root/my_python.py"
The exec replaces the shell wrapper with the Python script, which is generally what you want. The Python process will also wind up being the parent process of the ssh-agent, which could potentially surprise your program if the agent happens to exit.
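As a quick check that doesn't involve the application, you can confirm the key material is in place and usable from inside the running container (service name server1 from the compose file; BatchMode makes ssh fail instead of prompting if key-based authentication isn't working):
docker compose exec server1 sh -c 'ls -l "$HOME/.ssh" && ssh -o BatchMode=yes "$MANAGED_HOST" true && echo key-based auth OK'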
I have a Spring Boot Java application running in a Docker container, and it tries to run a shell script. The shell script has an ssh command, and I get the following error while running it:
2020-08-12 09:22:29.425 INFO 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService : Executing spark submit, calling shell script: /tmp/temp843155675494688636.sh 172.29.199.15
2020-08-12 09:22:29.434 DEBUG 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService : Starting Input Stream:
2020-08-12 09:22:29.435 INFO 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService : #1 arg: 172.29.199.15
2020-08-12 09:22:29.436 INFO 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService : Exist Value127
2020-08-12 09:22:29.436 ERROR 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService : Starting Error Stream:
2020-08-12 09:22:29.436 ERROR 1 --- [io-11013-exec-1] b.n.i.s.d.e.service.EmrManagerService :
/tmp/temp843155675494688636.sh: line 5: ssh: not found
The same code works fine when I run the jar directly and not as a Docker container.
Is it something to do with ssh not being recognized in the Docker container?
The shell script is:
#!/bin/bash
echo "#1 arg:" $1
ssh -i /home/dnaidaasd/aws-oneid-idaas-2020Q2.pem -oStrictHostKeyChecking=no hadoop@$1 '/etc/alternatives/jre/bin/java -Xmx1000m -server \
-XX:OnOutOfMemoryError="kill -9 %p" -cp "/usr/share/aws/emr/instance \
-controller/lib/*" -Dhadoop.log.dir=/mnt/var/log/hadoop/steps/s-100-120 \
-Dhadoop.log.file=syslog -Dhadoop.home.dir=/usr/lib/hadoop \
-Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=:/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native \
-Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true \
-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-14611-353/tmp \
-Dhadoop.security.logger=INFO,NullAppender \
-Dsun.net.inetaddr.ttl=30 \
org.apache.hadoop.util.RunJar /var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar spark-submit \
--conf spark.hadoop.mapred.output.compress=true \
--conf spark.hadoop.mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec \
--class biz.neustar.idaas.services.dataprofile.ProfileMain \
--name IdaasProfile --conf spark.dynamicAllocation.enabled=true \
--conf spark.executor.instances=2 --conf spark.driver.memory=8G \
--conf spark.executor.memory=4G --conf spark.executor.cores=1 \
--conf spark.sql.catalogImplementation=hive \
--jars s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/spark-core_2.11-2.4.5.jar,s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/spark-sql_2.11-2.4.5.jar,s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/spark-mllib_2.11-2.4.5.jar,s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/jackson-module-scala_2.11-2.6.7.1.jar,s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/jackson-databind-2.6.7.jar s3://oneid-idaas-dev-us-east-1/dev/emr/TestIdaasProfile/data-profile-14.0.jar' \
$2 $3 $4
This shell script is called as follows:
public void executeSparkSubmit(String masterNodeIp, String pathToScript, String input_hive_table, String s3_output_path, String output_hive_table ) throws IOException, InterruptedException, DataProfileServiceException {
log.info("Executing spark submit, calling shell script: " + pathToScript + " " + masterNodeIp);
ProcessBuilder pb = new ProcessBuilder("sh", pathToScript, masterNodeIp, input_hive_table, s3_output_path, output_hive_table);
Process pr = pb.start();
And the Dockerfile contents are:
FROM openjdk:8-jdk-alpine
ADD ./data-profile-provider/build/libs/data-profile-provider-203.2.0-SNAPSHOT.jar data-profile.jar
EXPOSE 11013
ENTRYPOINT ["java", "-jar", "data-profile.jar", "application.properties"]
As I suspected, your image is Alpine-based, and Alpine does not have an SSH client installed by default.
Corrected Dockerfile:
FROM openjdk:8-jdk-alpine
RUN apk add --no-cache openssh-client
ADD ./data-profile-provider/build/libs/data-profile-provider-203.2.0-SNAPSHOT.jar data-profile.jar
EXPOSE 11013
ENTRYPOINT ["java", "-jar", "data-profile.jar", "application.properties"]
Edit: I forgot to add that Alpine does not have Bash either. Luckily your app invokes the script with sh scriptname.sh; otherwise you'd get a bash: not found error.
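As a quick sanity check after rebuilding (run from the directory containing the corrected Dockerfile; the image tag data-profile is illustrative), you can confirm the client is now present in the image:
docker build -t data-profile .
docker run --rm --entrypoint sh data-profile -c 'ssh -V'
# prints the OpenSSH client version instead of "ssh: not found"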
SSH might not be installed.
My example here assumes an image derived from Ubuntu/Debian, since you did not specify the Dockerfile contents at the time.
If your container can launch successfully (ignore the fact that your app is failing), you can simply run ssh on the command line to check; it will give you something similar to command not found.
To run commands inside the Docker container: since an Ubuntu image has bash installed, you can run it like this:
docker exec -ti containername bash
Inside the Docker container (one of my containers where SSH is not installed):
ssh
ssh: command not found
The base image you inherit from might not have the tool installed. Most base images are built with a 'bare minimum' in mind, so your custom Docker image needs to install it itself.
Here is the RUN command you can add to the Dockerfile; make sure your user is able to run it (in this example I made sure the container image user is root). This installs only the SSH client, which is all that is required:
USER root
RUN apt-get update \
 && apt-get install -y openssh-client
USER mydockercontaineruser
This is the Docker image we use to host Kafka Connect with the plugins:
FROM confluentinc/cp-kafka-connect:5.3.1
ENV CONNECT_PLUGIN_PATH=/usr/share/java
# JDBC-MariaDB
RUN wget -nv -P /usr/share/java/kafka-connect-jdbc/ https://downloads.mariadb.com/Connectors/java/connector-java-2.4.4/mariadb-java-client-2.4.4.jar
# SNMP Source
RUN wget -nv -P /tmp/ https://github.com/name/kafka-connect-snmp/releases/download/0.0.1.11/kafka-connect-snmp-0.0.1.11.tar.gz
RUN mkdir /tmp/kafka-connect-snmp && tar -xf /tmp/kafka-connect-snmp-0.0.1.11.tar.gz -C /tmp/kafka-connect-snmp/
RUN mv /tmp/kafka-connect-snmp/usr/share/kafka-connect/kafka-connect-snmp /usr/share/java/
I run this image via docker-compose, and I have specified some common env variables, defined here: https://docs.confluent.io/current/installation/docker/config-reference.html#kafka-connect-configuration
But I would also like to specify connector-related config via env variables; for example, I have done this:
- CONNECT_NAME=snmp-connector
- CONNECT_CONNECTOR_CLASS=com.github.jcustenborder.kafka.connect.snmp.SnmpTrapSourceConnector
- CONNECT_TOPIC=fm_snmp
What I am trying to do is, instead of calling
curl -X POST -H "Content-Type: application/json" --data '{"name":"","config":{"connector.class":"com.github.jcustenborder.kafka.connect.snmp.SnmpTrapSourceConnector","topic":"fm_snmp"}}' http://localhost:8083/connectors
I want to just specify it via env variables, but unfortunately it's not working. When I check the list of active connectors with curl localhost:8083/connectors/, I don't see it listed there.
So finally, my question: can I configure it via env variables, or is curl the only way?
You can't pass it as environment variables, but you can specify it as part of your Docker startup by passing in a custom command. Here's an example of doing it with Docker Compose. If you're calling docker run itself you'd need to rework this into an appropriate structure:
kafka-connect:
  image: confluentinc/cp-kafka-connect:5.3.1
  environment:
    CONNECT_REST_PORT: 18083
    CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
    […]
  volumes:
    - $PWD/scripts:/scripts
  command:
    - bash
    - -c
    - |
      /etc/confluent/docker/run &
      echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
      while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) " (waiting for 200)"
        sleep 5
      done
      nc -vz kafka-connect 8083
      echo -e "\n--\n+> Creating Kafka Connect Elasticsearch sink"
      /scripts/create-es-sink.sh
      sleep infinity
This example calls a separate connector-creation script (/scripts/create-es-sink.sh), but if you want to embed the REST call directly you can do it like this.
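A sketch of the embedded variant, reusing the same wait loop and the connector name, class, and topic from the question (the JSON body is the same one you would otherwise POST with curl by hand):
command:
  - bash
  - -c
  - |
    /etc/confluent/docker/run &
    echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
    while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do
      sleep 5
    done
    echo -e "\n--\n+> Creating SNMP source connector"
    curl -s -X POST -H "Content-Type: application/json" http://kafka-connect:8083/connectors \
      --data '{"name":"snmp-connector","config":{"connector.class":"com.github.jcustenborder.kafka.connect.snmp.SnmpTrapSourceConnector","topic":"fm_snmp"}}'
    sleep infinity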
I'd like to serve a TensorFlow model by using OpenFaaS. Basically, I'd like to invoke the "serve" function in such a way that TensorFlow Serving exposes my model.
OpenFaaS is running correctly on Kubernetes and I am able to invoke functions via curl or from the UI.
I used the incubator-flask example, but I keep receiving 502 Bad Gateway all the time.
The OpenFaaS project looks like the following
serve/
- Dockerfile
stack.yaml
The inner Dockerfile is the following
FROM tensorflow/serving
RUN mkdir -p /home/app
RUN apt-get update \
&& apt-get install -y curl
RUN echo "Pulling watchdog binary from Github." \
&& curl -sSLf https://github.com/openfaas-incubator/of-watchdog/releases/download/0.4.6/of-watchdog > /usr/bin/fwatchdog \
&& chmod +x /usr/bin/fwatchdog
WORKDIR /root/
# remove unnecessary logs from S3
ENV TF_CPP_MIN_LOG_LEVEL=3
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_REGION=${AWS_REGION}
ENV S3_ENDPOINT=${S3_ENDPOINT}
ENV fprocess="tensorflow_model_server --rest_api_port=8501 \
--model_name=${MODEL_NAME} \
--model_base_path=${MODEL_BASE_PATH}"
# Set to true to see request in function logs
ENV write_debug="true"
ENV cgi_headers="true"
ENV mode="http"
ENV upstream_url="http://127.0.0.1:8501"
# gRPC tensorflow serving
# EXPOSE 8500
# REST tensorflow serving
# EXPOSE 8501
RUN touch /tmp/.lock
HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1
CMD [ "fwatchdog" ]
the stack.yaml file looks like the following
provider:
  name: faas
  gateway: https://gateway-url:8080
functions:
  serve:
    lang: dockerfile
    handler: ./serve
    image: repo/serve-model:latest
    imagePullPolicy: always
I build the image with faas-cli build -f stack.yaml and then I push it to my docker registry with faas-cli push -f stack.yaml.
When I execute faas-cli deploy -f stack.yaml -e AWS_ACCESS_KEY_ID=... I get Accepted 202 and it appears correctly among my functions. Now, I want to invoke the tensorflow serving on the model I specified in my ENV.
The way I try to make it work is to use curl in this way
curl -d '{"inputs": ["1.0", "2.0", "5.0"]}' -X POST https://gateway-url:8080/function/deploy-model/v1/models/mnist:predict
but I always obtain 502 Bad Gateway.
Does anybody have experience with OpenFaaS and Tensorflow Serving? Thanks in advance
P.S.
If I run tensorflow serving without of-watchdog (basically without the openfaas stuff), the model is served correctly.
Elaborating on the link mentioned by @viveksyngh:
tensorflow-serving-openfaas:
Example of packaging TensorFlow Serving with OpenFaaS to be deployed and managed through OpenFaaS with auto-scaling, scale-from-zero and a sane configuration for Kubernetes.
This example was adapted from: https://www.tensorflow.org/serving
Pre-reqs:
OpenFaaS
OpenFaaS CLI
Docker
Instructions:
Clone the repo
$ mkdir -p ~/dev/
$ cd ~/dev/
$ git clone https://github.com/alexellis/tensorflow-serving-openfaas
Clone the sample model and copy it to the function's build context
$ cd ~/dev/tensorflow-serving-openfaas
$ git clone https://github.com/tensorflow/serving
$ cp -r serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu ./ts-serve/saved_model_half_plus_two_cpu
Edit the Docker Hub username
You need to edit the stack.yml file and replace alexellis2 with your Docker Hub account.
Build the function image
$ faas-cli build
You should now have a Docker image in your local library which you can deploy to a cluster with faas-cli up
Test the function locally
All OpenFaaS images can be run stand-alone without OpenFaaS installed; let's do a quick test, but replace alexellis2 with your own name.
$ docker run -p 8081:8080 -ti alexellis2/ts-serve:latest
Now in another terminal:
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://127.0.0.1:8081/v1/models/half_plus_two:predict
{
"predictions": [2.5, 3.0, 4.5
]
}
From here you can run faas-cli up and then invoke your function from the OpenFaaS UI, CLI or REST API.
$ export OPENFAAS_URL=http://127.0.0.1:8080
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' $OPENFAAS_URL/function/ts-serve/v1/models/half_plus_two:predict
{
"predictions": [2.5, 3.0, 4.5
]
}
Other database Docker images that I've worked with (like Postgres) have a mechanism to import some initial data into their empty instance once the container starts for the first time. This usually takes the form of putting your SQL files in a specific folder.
I need to do the same for Neo4j. I want to compose a Neo4j Docker image with some data in it. What's the right way to do this?
This can be achieved; there are two requirements:
Set the initial password, which can be done with bin/neo4j-admin set-initial-password <password>.
Import the data from a file in Cypher format: cat import/data.cypher | NEO4J_USERNAME=neo4j NEO4J_PASSWORD=${NEO4J_PASSWD} bin/cypher-shell --fail-fast
A sample Dockerfile may look like this:
FROM neo4j:3.2
ENV NEO4J_PASSWD neo4jadmin
ENV NEO4J_AUTH neo4j/${NEO4J_PASSWD}
COPY data.cypher /var/lib/neo4j/import/
VOLUME /data
CMD bin/neo4j-admin set-initial-password ${NEO4J_PASSWD} || true && \
bin/neo4j start && sleep 5 && \
for f in import/*; do \
[ -f "$f" ] || continue; \
cat "$f" | NEO4J_USERNAME=neo4j NEO4J_PASSWORD=${NEO4J_PASSWD} bin/cypher-shell --fail-fast && rm "$f"; \
done && \
tail -f logs/neo4j.log
Build the image: sudo docker build -t neo4j-3.1:loaddata .
And run the container: docker run -it --rm --name neo4jtest neo4j-3.1:loaddata
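Once the container is up, a quick way to confirm the data was actually loaded is to query it with cypher-shell, using the credentials set in the Dockerfile above:
docker exec -it neo4jtest bash -c "echo 'MATCH (n) RETURN count(n);' | bin/cypher-shell -u neo4j -p neo4jadmin"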
Example of docker-compose for Neo4j:
version: '3'
services:
  # ...
  neo4j:
    image: 'neo4j:4.1'
    ports:
      - '7474:7474'
      - '7687:7687'
    volumes:
      - '$HOME/data:/data'
      - '$HOME/logs:/logs'
      - '$HOME/import:/var/lib/neo4j/import'
      - '$HOME/conf:/var/lib/neo4j/conf'
    environment:
      NEO4J_AUTH: 'neo4j/your_password'
  # ...
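With the import directory mounted as above, an initial data file placed in $HOME/import on the host can be loaded into the running service in much the same way as in the first answer. A sketch, using the service name and password from the compose file (data.cypher is an illustrative file name):
docker compose exec neo4j bash -c 'cat /var/lib/neo4j/import/data.cypher | cypher-shell -u neo4j -p your_password --fail-fast'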