I'm trying to set up a Spark development environment with Zeppelin on Docker, but I'm having trouble connecting the Zeppelin and Spark containers.
I'm deploying a Docker stack with the following docker-compose file:
version: '3'
services:
  spark-master:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.master.Master -h spark-master
    hostname: spark-master
    environment:
      SPARK_CONF_DIR: /conf
      SPARK_PUBLIC_DNS: 10.129.34.90
    volumes:
      - spark-master-volume:/conf
      - spark-master-volume:/tmp/data
    ports:
      - 8000:8080
  spark-worker:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077
    hostname: spark-worker
    environment:
      SPARK_MASTER_URL: spark-master:7077
      SPARK_CONF_DIR: /conf
      SPARK_PUBLIC_DNS: 10.129.34.90
      SPARK_WORKER_CORES: 2
      SPARK_WORKER_MEMORY: 2g
    volumes:
      - spark-worker-volume:/conf
      - spark-worker-volume:/tmp/data
    ports:
      - "8081-8100:8081-8100"
  zeppelin:
    image: apache/zeppelin:0.8.0
    ports:
      - 8080:8080
      - 8443:8443
    volumes:
      - spark-master-volume:/opt/zeppelin/logs
      - spark-master-volume:/opt/zeppelin/notebookcd
    environment:
      MASTER: "spark://spark-master:7077"
      SPARK_MASTER: "spark://spark-master:7077"
      SPARK_HOME: /usr/spark-2.4.1
    depends_on:
      - spark-master
volumes:
  spark-master-volume:
    driver: local
  spark-worker-volume:
    driver: local
It builds normally, but when I try to run Spark in Zeppelin, it throws:
java.lang.RuntimeException: /zeppelin/bin/interpreter.sh: line 231: /usr/spark-2.4.1/bin/spark-submit: No such file or directory
I think the problem is in the volumes, but I can't figure out how to set them up correctly.
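You need to install Spark in your Zeppelin Docker image so that spark-submit is available there, and update the Spark interpreter config to point to your Spark cluster. The error says that /usr/spark-2.4.1/bin/spark-submit does not exist inside the Zeppelin container, so SPARK_HOME has to point at a real Spark installation in that container. For example: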
zeppelin_notebook_server:
  container_name: zeppelin_notebook_server
  build:
    context: zeppelin/
  restart: unless-stopped
  volumes:
    - ./zeppelin/config/interpreter.json:/zeppelin/conf/interpreter.json:rw
    - ./zeppelin/notebooks:/zeppelin/notebook
    - ../sample-data:/sample-data:ro
  ports:
    - "8085:8080"
  networks:
    - general
  labels:
    container_group: "notebook"
spark_base:
  container_name: spark-base
  build:
    context: spark/base
  image: spark-base:latest
spark_master:
  container_name: spark-master
  build:
    context: spark/master/
  networks:
    - general
  hostname: spark-master
  ports:
    - "3030:8080"
    - "7077:7077"
  environment:
    - "SPARK_LOCAL_IP=spark-master"
  depends_on:
    - spark_base
  volumes:
    - ./spark/apps/jars:/opt/spark-apps
    - ./spark/apps/data:/opt/spark-data
    - ../sample-data:/sample-data:ro
spark_worker_1:
  container_name: spark-worker-1
  build:
    context: spark/worker/
  networks:
    - general
  hostname: spark-worker-1
  ports:
    - "3031:8081"
  env_file: spark/spark-worker-env.sh
  environment:
    - "SPARK_LOCAL_IP=spark-worker-1"
  depends_on:
    - spark_master
  volumes:
    - ./spark/apps/jars:/opt/spark-apps
    - ./spark/apps/data:/opt/spark-data
    - ../sample-data:/sample-data:ro
spark_worker_2:
  container_name: spark-worker-2
  build:
    context: spark/worker/
  networks:
    - general
  hostname: spark-worker-2
  ports:
    - "3032:8082"
  env_file: spark/spark-worker-env.sh
  environment:
    - "SPARK_LOCAL_IP=spark-worker-2"
  depends_on:
    - spark_master
  volumes:
    - ./spark/apps/jars:/opt/spark-apps
    - ./spark/apps/data:/opt/spark-data
    - ../sample-data:/sample-data:ro
Zeppelin Dockerfile:
FROM "apache/zeppelin:0.8.1"
RUN wget http://apache.mirror.iphh.net/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz --progress=bar:force && \
    tar xvf spark-2.4.3-bin-hadoop2.7.tgz && \
    mkdir -p /usr/local/spark && \
    mv spark-2.4.3-bin-hadoop2.7/* /usr/local/spark/. && \
    mkdir -p /sample-data
ENV SPARK_HOME "/usr/local/spark/"
Make sure the Spark interpreter config in Zeppelin points to the cluster master (spark://spark-master:7077 for the compose file above).
Create a Dockerfile with the following content:
FROM gettyimages/spark
ENV APACHE_SPARK_VERSION 2.4.1
ENV APACHE_HADOOP_VERSION 2.8.0
ENV ZEPPELIN_VERSION 0.8.1
RUN apt-get update
RUN set -x \
    && curl -fSL "http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.1/zeppelin-0.8.1-bin-all.tgz" -o /tmp/zeppelin.tgz \
    && tar -xzvf /tmp/zeppelin.tgz -C /opt/ \
    && mv /opt/zeppelin-* /opt/zeppelin \
    && rm /tmp/zeppelin.tgz
ENV SPARK_SUBMIT_OPTIONS "--jars /opt/zeppelin/sansa-examples-spark-2016-12.jar"
ENV SPARK_HOME "/usr/spark-2.4.1/"
WORKDIR /opt/zeppelin
CMD ["/opt/zeppelin/bin/zeppelin.sh"]
Then define the service in your docker-compose.yml file, starting with:
version: '3'
services:
  zeppelin:
    build: ./zeppelin
    image: zeppelin:0.8.1-hadoop-2.8.0-spark-2.4.1
    ...
Finally, run docker-compose -f docker-compose.yml build to build the customised image before running docker stack deploy.
I have several Flask applications that I want to run on a server as separate Docker containers. The server already hosts several websites behind a reverse proxy with the letsencrypt-nginx-proxy-companion. Unfortunately, I can't get the containers to run, and I think the port mapping is the cause. When I start the containers on port 80, gunicorn reports "[ERROR] Can't connect to ('', 80)". On all other ports it starts successfully, but then I can't access it from outside.
What am I doing wrong?
docker-compose.yml
version: '3'
services:
  db:
    image: "mysql/mysql-server:5.7"
    env_file: .env-mysql
    restart: always
  app:
    build: .
    env_file: .env
    expose:
      - "8001"
    environment:
      - VIRTUAL_HOST:example.com
      - VIRTUAL_PORT:'8001'
      - LETSENCRYPT_HOST:example.com
      - LETSENCRYPT_EMAIL:foo#example.com
    links:
      - db:dbserver
    restart: always
networks:
  default:
    external:
      name: nginx-proxy
Dockerfile
FROM python:3.6-alpine
ARG CONTAINER_USER='flask-user'
ENV FLASK_APP run.py
ENV FLASK_CONFIG docker
RUN adduser -D ${CONTAINER_USER}
USER ${CONTAINER_USER}
WORKDIR /home/${CONTAINER_USER}
COPY requirements requirements
RUN python -m venv venv
RUN venv/bin/pip install -r requirements/docker.txt
COPY app app
COPY migrations migrations
COPY run.py config.py entrypoint.sh ./
# runtime configuration
EXPOSE 8001
ENTRYPOINT ["./entrypoint.sh"]
entrypoint.sh
#!/bin/sh
source venv/bin/activate
flask deploy
exec gunicorn -b :8001 --access-logfile - --error-logfile - run:app
reverse-proxy/docker-compose.yml
version: '3'
services:
  nginx:
    image: nginx
    labels:
      com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy: "true"
    container_name: nginx
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /srv/www/nginx-proxy/conf.d:/etc/nginx/conf.d
      - /srv/www/nginx-proxy/vhost.d:/etc/nginx/vhost.d
      - /srv/www/nginx-proxy/html:/usr/share/nginx/html
      - /srv/www/nginx-proxy/certs:/etc/nginx/certs:ro
  nginx-gen:
    image: jwilder/docker-gen
    command: -notify-sighup nginx -watch -wait 5s:30s /etc/docker-gen/templates/nginx.tmpl /etc/nginx/conf.d/default.conf
    container_name: nginx-gen
    restart: always
    volumes:
      - /srv/www/nginx-proxy/conf.d:/etc/nginx/conf.d
      - /srv/www/nginx-proxy/vhost.d:/etc/nginx/vhost.d
      - /srv/www/nginx-proxy/html:/usr/share/nginx/html
      - /srv/www/nginx-proxy/certs:/etc/nginx/certs:ro
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - /srv/www/nginx-proxy/nginx.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro
  nginx-letsencrypt:
    image: jrcs/letsencrypt-nginx-proxy-companion
    container_name: nginx-letsencrypt
    restart: always
    volumes:
      - /srv/www/nginx-proxy/conf.d:/etc/nginx/conf.d
      - /srv/www/nginx-proxy/vhost.d:/etc/nginx/vhost.d
      - /srv/www/nginx-proxy/html:/usr/share/nginx/html
      - /srv/www/nginx-proxy/certs:/etc/nginx/certs:rw
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      NGINX_DOCKER_GEN_CONTAINER: "nginx-gen"
      NGINX_PROXY_CONTAINER: "nginx"
      DEBUG: "true"
networks:
  default:
    external:
      name: nginx-proxy
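One thing that may be worth checking in the app's docker-compose.yml is the environment block. In list form, Compose expects KEY=value entries; an item written as KEY:value with no space after the colon is a single plain string, so it is likely that the proxy never sees VIRTUAL_HOST and the other variables. A minimal sketch of the app service using the mapping form instead, assuming the values from the question are otherwise what is intended:

```yaml
# Sketch of the app service only; all values are copied from the question.
services:
  app:
    build: .
    env_file: .env
    expose:
      - "8001"
    environment:
      # Mapping form: "KEY: value", with a space after the colon.
      VIRTUAL_HOST: example.com
      VIRTUAL_PORT: "8001"
      LETSENCRYPT_HOST: example.com
      # Placeholder address, kept exactly as written in the question.
      LETSENCRYPT_EMAIL: foo#example.com
```

The equivalent list form would be entries such as - VIRTUAL_HOST=example.com. Running docker-compose config prints the resolved file and shows which variables each service actually receives.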
I'm running Composer in a Docker container and I'm trying to install predis, so I run the command
docker-compose run --rm composer require predis/predis
But I get an error.
Can someone explain how to fix it?
This is my Dockerfile:
FROM php:7.2-fpm-alpine
RUN docker-php-ext-install pdo pdo_mysql
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
RUN composer --version
and this is my docker-compose file:
services:
  nginx:
    image: nginx:stable-alpine
    container_name: nginx
    ports:
      - "8088:80"
    volumes:
      - ./src:/var/www/html
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf
    depends_on:
      - mysql
      - php
    networks:
      - laravel
  mysql:
    image: mysql:5.7.22
    container_name: mysql
    restart: unless-stopped
    tty: true
    ports:
      - "4306:3306"
    environment:
      MYSQL_DATABASE: homestead
      MYSQL_USER: homestead
      MYSQL_PASSWORD: secret
      MYSQL_ROOT_PASSWORD: secret
      SERVICE_TAGS: dev
      SERVICE_NAME: mysql
    networks:
      - laravel
  php:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: php
    volumes:
      - ./src:/var/www/html
    ports:
      - "9000:9000"
    networks:
      - laravel
  redis:
    image: redis:5.0.0-alpine
    restart: always
    container_name: redis
    ports:
      - "6379:6379"
    networks:
      - laravel
In your docker-compose file, add an environment entry with COMPOSER_MEMORY_LIMIT: -1 to your php service. Setting it to -1 removes Composer's memory limit.
So your php service looks like this:
  php:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: php
    volumes:
      - ./src:/var/www/html
    environment:
      COMPOSER_MEMORY_LIMIT: -1
    ports:
      - "9000:9000"
    networks:
      - laravel
I have an application built in React running on Docker, and I am looking for a way to debug it with Visual Studio Code. Here are my Dockerfile and docker-compose file:
FROM node:boron
ARG build_env
RUN mkdir /usr/share/unicode && cd /usr/share/unicode && wget ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
COPY package.json /tmp/package.json
RUN cd /tmp && npm install
COPY ./shim/RelayDefaultNetworkLayer.js /tmp/node_modules/react-relay/lib/RelayDefaultNetworkLayer.js
COPY ./shim/buildRQL.js /tmp/node_modules/react-relay/lib/buildRQL.js
RUN mkdir -p /var/www && cp -a /tmp/node_modules /var/www/
WORKDIR /var/www
COPY . ./
RUN if [ "$build_env" != "development" ]; then npm run build-webpack && npm run gulp; fi
EXPOSE 8080
CMD ["npm", "run", "--debug=5858 prod"]
My docker-compose file looks like this:
version: '2'
services:
  nginx:
    container_name: nginx
    image: openroad/nginx
    build:
      context: nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx/nginx.development.conf:/etc/nginx/nginx.conf
    networks:
      - orion-network
  graphql:
    container_name: graphql
    image: openroad/graphql
    build:
      context: integration_api
    volumes:
      - ./integration_api:/var/www
    environment:
      - NODE_ENV=development
    command: npm run dev
    working_dir: /var/www
    networks:
      orion-network:
        ipv4_address: 172.16.238.10
  pegasus:
    container_name: pegasus
    image: openroad/pegasus
    build:
      context: pegasus
      args:
        build_env: development
    expose:
      - "3000"
    volumes:
      - ./pegasus:/var/www/public
    environment:
      - NODE_ENV=development
    command: npm run dev
    working_dir: /var/www/public
    extra_hosts:
      - "local.pegasus.com:192.168.99.100"
    networks:
      orion-network:
        ipv4_address: 172.16.238.11
  frontend:
    container_name: orion-frontend
    image: openroad/orion-frontend
    build:
      context: orion-frontend
      args:
        build_env: development
    expose:
      - "3000"
    ports:
      - "5858:5858"
    volumes:
      - ./orion-frontend:/var/www/public
    environment:
      - NODE_ENV=development
    command: npm run --debug=5858 dev
    working_dir: /var/www/public
    networks:
      orion-network:
        ipv4_address: 172.16.238.12
  admin:
    container_name: orion-admin
    image: openroad/orion-admin
    build:
      context: orion-admin
      args:
        build_env: development
    expose:
      - "3000"
    volumes:
      - ./orion-admin:/var/www/
    environment:
      - NODE_ENV=development
    command: npm run dev
    working_dir: /var/www/
    networks:
      orion-network:
        ipv4_address: 172.16.238.13
  uploads:
    container_name: orion-uploads
    image: openroad/orion-uploads
    build:
      context: orion-uploads
    volumes:
      - ./orion-uploads:/var/www/
    working_dir: /var/www/
    networks:
      orion-network:
        ipv4_address: 172.16.238.14
  dashboard:
    container_name: orion-dashboard
    image: openroad/orion-dashboard
    build:
      context: orion-dashboard
      args:
        build_env: development
    volumes:
      - ./orion-dashboard/src:/var/www/src
      - ./orion-dashboard/package.json:/var/www/package.json
      - ./orion-dashboard/webpack.config.babel.js:/var/www/webpack.config.babel.js
      - ./orion-dashboard/node_modules:/var/www/node_modules
      - ./orion-dashboard/data/babelRelayPlugin.js:/var/www/data/babelRelayPlugin.js
    working_dir: /var/www
    environment:
      - NODE_ENV=development
      - GRAPHQLURL=http://172.16.238.10:8080/graphql
      - PORT=8080
    command: npm run dev
    networks:
      orion-network:
        ipv4_address: 172.16.238.15
networks:
  orion-network:
    driver: bridge
    driver_opts:
      com.docker.network.bridge.enable_ip_masquerade: "true"
    ipam:
      driver: default
      config:
        - subnet: 172.16.238.0/24
          gateway: 172.16.238.1
I want to be able to debug the application running in the orion-frontend container. I tried various options without any success, including https://codefresh.io/docker-tutorial/debug_node_in_docker/ and https://blog.docker.com/2016/07/live-debugging-docker/.
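I may be wrong about the command syntax for npm run (I didn't find this option in the npm docs), but you may need to separate the --debug=5858 and prod args. In the exec form of CMD, each array element is passed as one argument, so "--debug=5858 prod" reaches npm as a single string. Split it like this: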
CMD ["npm", "run", "--debug=5858", "prod"]
Here is my docker-compose.yml:
version: '3.4'
services:
nginx:
restart: always
image: nginx:latest
ports:
- 80:80
volumes:
- ./misc/nginx.conf:/etc/nginx/conf.d/default.conf
- /static:/static
depends_on:
- web
web:
restart: always
image: celery-with-docker-compose:latest
build: .
command: bash -c "python /code/manage.py collectstatic --noinput && python /code/manage.py migrate && /code/run_gunicorn.sh"
volumes:
- /static:/data/web/static
- /media:/data/web/media
- .:/code
env_file:
- ./.env
depends_on:
- db
volumes:
- ./app:/deploy/app
worker:
image: celery-with-docker-compose:latest
restart: always
build:
context: .
command: bash -c "pip install -r /code/requirements.txt && /code/run_celery.sh"
volumes:
- .:/code
env_file:
- ./.env
depends_on:
- redis
- web
db:
restart: always
image: postgres
env_file:
- ./.env
volumes:
- pgdata:/var/lib/postgresql/data
ports:
- "5432:5432"
redis:
restart: always
image: redis:latest
privileged: true
command: bash -c "sysctl vm.overcommit_memory=1 && redis-server"
ports:
- "6379:6379"
volumes:
pgdata:
When I run docker stack deploy -c docker-compose.yml cryptex I get
Non-string key at top level: true
And docker-compose -f docker-compose.yml config gives me
ERROR: In file './docker-compose.yml', the service name True must be a quoted string, i.e. 'True'.
I'm using the latest versions of Docker and Compose. I'm also new to Compose v3 and started using it to get access to the docker stack command. If you see any mistakes or redundancies in the config file, please let me know. Thanks.
You have to validate your docker-compose file; it probably contains an invalid value.
Validating your file is as simple as docker-compose -f docker-compose.yml config. As always, you can omit the -f docker-compose.yml part when running this in the same folder as the file itself or having the
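One way a key can end up as the boolean true: under YAML 1.1 rules, which Compose tooling of that era followed, unquoted words such as yes, no, on, off, true, and false are parsed as booleans, even when they appear as mapping keys. The file as posted shows no such key, so whether this is the cause here is an assumption. The sketch below uses a hypothetical service named on purely for illustration:

```yaml
# Hypothetical fragment: the service name "on" is illustrative, not from the question.
version: '3.4'
services:
  # Written unquoted, this key would be read as the boolean true and rejected:
  # on:
  #   image: redis:latest
  # Quoted, it stays a string and is a valid service name:
  "on":
    image: redis:latest
```

Quoting the key is the fix the docker-compose error message itself suggests ("must be a quoted string").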
I'm trying to cap the size of Docker's log files. Each container's log file (edge, worker, and so on) should max out at 100 MB.
I tried to insert:
log_opt:
  max-size: 100m
at the end of my docker-compose.yml file below, but I'm getting an error.
Where should I place it? When I place it inside each container definition I also get an error. I read the Docker docs, but nowhere do they say exactly where to place the option.
This is my docker-compose.yml file:
version: '2.0'
services:
  ubuntu:
    image: ubuntu
    volumes:
      - box:/box
  cache:
    image: redis:3.0
  rabbitmq:
    image: rabbitmq:3-management
    volumes:
      - ${DATA}/rabbitmq:/var/lib/rabbitmq
    ports:
      - "15672:15672"
      - "5672:5672"
  placements-store:
    image: redis:3.0
    command: redis-server ${REDIS_OPTIONS}
    ports:
      - "6379:6379"
  api:
    image: ruby:2.3
    command: bundle exec puma -C config/puma.rb
    env_file:
      - ./.env
    working_dir: /app
    volumes:
      - .:/app/
      - box:/box
    expose:
      - 3000
    depends_on:
      - cache
      - placements-store
  worker:
    image: ruby:2.3
    command: bundle exec sidekiq -C ./config/schedule.yml -q default -q high_priority,5 -c 10
    env_file:
      - ./.env
    working_dir: /app
    environment:
      INSTANCE_TYPE: worker
    volumes:
      - .:/app/
      - box:/box
    depends_on:
      - cache
      - placements-store
  sidekiq-monitor:
    image: ruby:2.3
    command: bundle exec thin start -R sidekiq.ru -p 9494
    env_file:
      - ./.env
    working_dir: /app
    volumes:
      - .:/app/
      - box:/box
    depends_on:
      - cache
    expose:
      - 9494
  sneakers:
    image: ruby:2.3
    command: bundle exec rails sneakers:run
    env_file:
      - ./.env
    working_dir: /app
    environment:
      INSTANCE_TYPE: worker
    volumes:
      - .:/app/
      - box:/box
    depends_on:
      - cache
      - placements-store
      - rabbitmq
  edge:
    image: ruby:2.3
    command: bundle exec thin start -R config.ru -p 3000
    environment:
      REDIS_URL: redis://placements-store
      RACK_ENV: development
      BUNDLE_PATH: /box
      RABBITMQ_HOST: rabbitmq
    working_dir: /app
    volumes:
      - ./edge:/app/
      - box:/box
    depends_on:
      - placements-store
      - rabbitmq
    expose:
      - 3000
  proxy:
    image: openresty/openresty:latest-xenial
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./config/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf
volumes:
  box:
  # node_modules:
  # bower_components:
  # client_dist:
This is what I tried, for example inserting under the rabbitmq container:
version: '2.0'
services:
  ubuntu:
    image: ubuntu
    volumes:
      - box:/box
  cache:
    image: redis:3.0
  rabbitmq:
    image: rabbitmq:3-management
    #volumes:
    #  - ${DATA}/rabbitmq:/var/lib/rabbitmq
    ports:
      - "15672:15672"
      - "5672:5672"
    log_opt:
      max-size: 50m
  placements-store:
    image: redis:3.0
    command: redis-server ${REDIS_OPTIONS}
    ports:
      - "6379:6379"
This is the error I get:
ERROR: The Compose file './docker-compose.yml' is invalid because:
Unsupported config option for services.rabbitmq: 'log_opt'
I tried changing log_opt: to options: and got the same kind of error:
ERROR: The Compose file './docker-compose.yml' is invalid because:
Unsupported config option for services.rabbitmq: 'options'
Also the docker version is:
docker --version && docker-compose --version
Docker version 1.11.2, build b9f10c9/1.11.2
docker-compose version 1.9.0, build 2585387
UPDATE:
I tried using the logging option as the docs describe (for version 2.0):
version: '2.0'
services:
  ubuntu:
    image: ubuntu
    volumes:
      - box:/box
  cache:
    image: redis:3.0
  rabbitmq:
    image: rabbitmq:3-management
    #volumes:
    #  - ${DATA}/rabbitmq:/var/lib/rabbitmq
    ports:
      - "15672:15672"
      - "5672:5672"
    logging:
      driver: "json-file"
      options:
        max-size: 100m
        max-file: 1
  placements-store:
    image: redis:3.0
    command: redis-server ${REDIS_OPTIONS}
    ports:
      - "6379:6379"
Now I get the error:
ERROR: for rabbitmq Cannot create container for service rabbitmq: json: cannot unmarshal number into Go value of type string
ERROR: Encountered errors while bringing up the project.
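The "cannot unmarshal number into Go value of type string" message suggests that a logging option reached the Docker daemon as a number. In YAML, an unquoted 1 is an integer, while logging driver options are expected to be strings. A minimal sketch of the rabbitmq service with both options quoted, assuming the rest of the file stays as posted:

```yaml
version: '2.0'
services:
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "15672:15672"
      - "5672:5672"
    logging:
      driver: "json-file"
      options:
        # Quoted so that YAML yields strings rather than numbers.
        max-size: "100m"
        max-file: "1"
```

The logging block is per service, so it would need to be repeated under each service (edge, worker, and so on) whose log file should be capped.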