Context: I have a master-worker system built on a Celery + RabbitMQ stack.
The system is dockerized (the worker service is not shown here):
version: '2'
services:
rabbit:
hostname: rabbit
image: rabbitmq:latest
environment:
- RABBITMQ_DEFAULT_USER=admin
- RABBITMQ_DEFAULT_PASS=mypass
ports:
- "5672:5672"
master:
build:
context: .
dockerfile: dockerfile
volumes:
- .:/app
links:
- rabbit
depends_on:
- rabbit
When I execute docker-compose up, everything is OK!
Problem: But I can't use docker-compose up; I need docker-compose up master and docker-compose up worker (two separate commands, for the master and worker machines). So, when I execute docker-compose up master, the container launches but then hangs:
Research: I have found out that it hangs on task submission:
result = longtime_add.delay(count)
where longtime_add is a task.
Full code: https://github.com/waryak/MastersDiploma/tree/vlad
Also, please edit the title - I feel it needs to be clearer.
A couple of quick points: (1) I didn't see the expected output messages for the producer broker URL that you have on GitHub; (2) I couldn't find where /src/network was added to your PYTHONPATH; and (3) the code that loads the producer broker URL in celery.py looks wrong, as it is looking for the CONFIG variable and not PRODUCE_BROKER_URL as it is in the variables.env file. The reason the producer would time out is that it can't connect to the broker, so you're on the right track by printing out the producer and worker broker URLs. It may just be easier for you to try hardcoding the broker URL in the producer first:
from celery.app import Celery

app = Celery(broker='amqp://admin:mypass@rabbit:5672')
app.send_task(name='messages.tasks.longtime_add', kwargs={})
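If the hard-coded call goes through, the problem is in how the broker URL is being loaded from the environment; if it still hangs, the master container genuinely can't reach rabbit, and it's worth comparing the network setup against the plain docker-compose up case that works.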
I just tried docker-compose up rabbit master and it worked. That's strange, because I can see no external differences in the broker logs or any other logs. Also, the Docker documentation assured me that all dependent services would be launched...
Related
I've got a very simple single-host docker compose setup:
version: "3"
services:
bukofka:
image: picoglavar
restart: always
environment:
- PORT=8000
- MODEL=/models/large
volumes:
- glavar:/models
chlenix:
image: picoglavar
restart: always
environment:
- PORT=8000
- MODEL=/models/small
volumes:
- glavar:/models
# ... other containers ...
As you can see, it's only two services based on a single image, so nothing special really. When I run docker ps I can see these two services churning. Then I open htop and see that each Python application appears at least four times; this is very surprising, because I haven't set up any in-container replication, and I'm not running in any kind of swarm mode.
Why does this happen?
I'm a complete idiot. And colour blind too, apparently.
Note that the lines in green are threads, not processes: https://superuser.com/a/1496571/173193
(per @nick-odell)
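A quick way to confirm this from the command line (assuming procps ps, and that the processes are named python3 - the name here is an assumption) is to print the thread count per process:
ps -C python3 -o pid,nlwp,args
NLWP is the number of threads ("light-weight processes") in each process, so one process with NLWP=4 shows up as four green lines in htop. In htop itself, pressing H (Setup → Display options → "Hide userland process threads") collapses them back into one line per process.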
The problem
I am attempting to create a docker-compose file that will host three services: InfluxDB, Grafana, and a custom script in a custom Dockerfile that fills the database. I am having networking issues: the custom script is not able to connect to InfluxDB due to a connection refused error (shown below).
What is working so far
The interesting thing is that when I remove the custom script service (called ads_agent) from my docker-compose file and either run the script from localhost or even build and run its Dockerfile in its own container, it connects just fine.
What's the difference between the two
My script reads an environment variable called KTS_TELEMETRY_INFLUXDB_URL, which is used as the InfluxDB client's URL to connect to. I can use "http://localhost:8086" for the URL when I run the script from my command line; that works. I use my local machine's LAN IP address when I wrap the script in a Docker container, because to the script, localhost would just be the container itself. But nonetheless, this works just fine.
From within my docker-compose, since all three services are on the same network, I'm using "http://influxdb:8086", since that hostname should be bound to that service's network interface. And indeed it is, because Grafana connects just fine using that URL. Sadly, when I try this with the script, I get connection refused.
The error
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f18c1fec970>: Failed to establish a new connection: [Errno 111] Connection refused
My code
This is my docker-compose.yaml
version: "3"
services:
influxdb:
container_name: influxdb
image: influxdb:2.0.9-alpine # influxdb:latest
networks:
- telemetry_network
ports:
- 8086:8086
volumes:
- influxdb-storage:/var/lib/influxdb2
restart: always
environment:
- DOCKER_INFLUXDB_INIT_MODE=setup
- DOCKER_INFLUXDB_INIT_USERNAME=$KTS_TELEMETRY_INFLUXDB_USERNAME
- DOCKER_INFLUXDB_INIT_PASSWORD=$KTS_TELEMETRY_INFLUXDB_PASSWORD
- DOCKER_INFLUXDB_INIT_ORG=$KTS_TELEMETRY_INFLUXDB_ORG
- DOCKER_INFLUXDB_INIT_BUCKET=$KTS_TELEMETRY_INFLUXDB_BUCKET
- DOCKER_INFLUXDB_INIT_RETENTION=$KTS_TELEMETRY_INFLUXDB_RETENTION
- DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=$KTS_TELEMETRY_INFLUXDB_TOKEN
grafana:
container_name: grafana
image: grafana/grafana:8.1.7 # grafana/grafana:latest
networks:
- telemetry_network
ports:
- 3000:3000
volumes:
- grafana-storage:/var/lib/grafana
restart: always
depends_on:
- influxdb
ads_agent:
container_name: ads_agent
build: ./ads_agent
networks:
- telemetry_network
restart: always
depends_on:
- influxdb
environment:
- KTS_TELEMETRY_INFLUXDB_URL=http://influxdb:8086
- KTS_TELEMETRY_INFLUXDB_TOKEN=$KTS_TELEMETRY_INFLUXDB_TOKEN
- KTS_TELEMETRY_INFLUXDB_ORG=$KTS_TELEMETRY_INFLUXDB_ORG
- KTS_TELEMETRY_INFLUXDB_BUCKET=$KTS_TELEMETRY_INFLUXDB_BUCKET
networks:
telemetry_network:
volumes:
influxdb-storage:
grafana-storage:
This is my ads_agent/Dockerfile
FROM python:3.9
COPY requirements.txt .
RUN pip install --upgrade pip
RUN pip install -r /requirements.txt
COPY main.py .
ENTRYPOINT /usr/local/bin/python3 /main.py
ads_agent/requirements.txt just has influxdb-client, and this is my ads_agent/main.py:
import os
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS
from datetime import datetime
import random
import time
token = os.environ["KTS_TELEMETRY_INFLUXDB_TOKEN"]
org = os.environ["KTS_TELEMETRY_INFLUXDB_ORG"]
bucket = os.environ["KTS_TELEMETRY_INFLUXDB_BUCKET"]
url = os.environ["KTS_TELEMETRY_INFLUXDB_URL"]
client = InfluxDBClient(url=url, token=token)
dbh = client.write_api(write_options=SYNCHRONOUS)
while True:
symbol_name = 'rand_num'
value = random.random()
timestamp = datetime.utcnow()
print(timestamp, symbol_name, value)
point = Point("mem") \
.field(symbol_name, value) \
.time(timestamp, WritePrecision.NS)
dbh.write(bucket, org, point)
time.sleep(1)
Your problem is not related to network connectivity, just to startup order. Although you define depends_on - influxdb for ads_agent, there is still a chance that when your script tries to connect, InfluxDB has not finished starting up.
This is why it succeeds when you do it manually: your manual steps introduce a delay, and by that time the database is already ready.
The reason, from the Docker documentation:
depends_on does not wait for db and redis to be “ready” before starting web - only until they have been started. If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.
To make sure your database is really up before your script starts, refer to Control startup and shutdown order in Compose:
To handle this, design your application to attempt to re-establish a connection to the database after a failure. If the application retries the connection, it can eventually connect to the database.
The best solution is to perform this check in your application code, both at startup and whenever a connection is lost for any reason. However, if you don’t need this level of resilience, you can work around the problem with a wrapper script:
Use a tool such as wait-for-it, dockerize, sh-compatible wait-for, or RelayAndContainers template. These are small wrapper scripts which you can include in your application’s image to poll a given host and port until it’s accepting TCP connections.
For example, to use wait-for-it.sh or wait-for to wrap your service’s command:
version: "2"
services:
web:
build: .
ports:
- "80:8000"
depends_on:
- "db"
command: ["./wait-for-it.sh", "db:5432", "--", "python", "app.py"]
db:
image: postgres
Alternatively, write your own wrapper script to perform a more application-specific health check.
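For the ads_agent script in the question, that application-level check could be a small retry loop at startup. A minimal sketch, assuming a recent influxdb-client where InfluxDBClient.ping() returns False while the server is unreachable:

import os
import time

from influxdb_client import InfluxDBClient

url = os.environ["KTS_TELEMETRY_INFLUXDB_URL"]
token = os.environ["KTS_TELEMETRY_INFLUXDB_TOKEN"]

client = InfluxDBClient(url=url, token=token)

# Block until InfluxDB accepts connections, instead of crashing once
# and relying on a restart policy to try again.
while not client.ping():
    print("InfluxDB not ready yet, retrying in 2s...")
    time.sleep(2)

print("InfluxDB is up; starting to write points")

With this in place, ads_agent can keep depends_on for ordering but no longer cares exactly when InfluxDB finishes initializing.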
I am trying to take a very difficult, multi-step Docker setup and turn it into an easy docker-compose setup. I haven't really used docker-compose before. I am used to using a Dockerfile to build an image, then running something like
docker run --name mysql -v ${PWD}/sql-files:/docker-entrypoint-initdb.d ... -h mysql -d mariadb:10.4
Then I run the web app in the same manner after building its Dockerfile, which is simple. Trying to combine these into a docker-compose.yml file seems to be quite difficult. I'll post my docker-compose.yml file, edited to remove passwords and such, and the error I am getting; hopefully someone can figure out why it's failing, because I have no idea...
docker-compose.yml
version: "3.7"
services:
mysql:
image: mariadb:10.4
container_name: mysql
environment:
MYSQL_ROOT_PASSWORD: passwordd1
MYSQL_ALLOW_EMPTY_PASSWORD: "true"
volumes:
- ./sql-files:/docker-entrypoint-initdb.d
- ./ssl:/var/lib/mysql/ssl
- ./tls.cnf:/etc/mysql/conf.d/tls.cnf
healthcheck:
test: ["CMD", "mysqladmin ping"]
interval: 10s
timeout: 2s
retries: 10
web:
build: ./base/newdockerfile
container_name: web
hostname: dev.website.com
volumes:
- ./ssl:/etc/httpd/ssl
- ./webfiles:/var/www
depends_on:
mysql:
condition: service_healthy
ports:
- "8443:443"
- "8888:80"
entrypoint:
- "/sbin/httpd"
- "-D"
- "FOREGROUND"
The error I get when running docker-compose up in the terminal window is...
Service 'web' depends on service 'mysql' which is undefined.
Why would mysql be undefined? It's the first definition in the yml file and has a health check associated with it. Also, it fails very quickly, within a few seconds, so there's no way all the healthchecks ran and failed, and I'm not getting any other errors in the terminal window. I run docker-compose up and within a couple of seconds I get that error. Any help would be greatly appreciated. Thank you.
According to this documentation:
depends_on does not wait for db and redis to be “ready” before starting web - only until they have been started. If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.
Designing your application so it's resilient when the database is not available or not set up yet is something we all have to deal with. A healthcheck doesn't guarantee the database is ready before the next stage. The best way is probably to write a wait-for-it script (or use wait-for) and run it via the service's command alongside depends_on:
depends_on:
  - mysql
command: ["./wait-for-it.sh", "mysql:3306", "--", "/sbin/httpd", "-D", "FOREGROUND"]
I have a docker container which is brought up with the help of a docker-compose file along with a database container. I want to do this:
Keep the database container running
Schedule the container with my python program to run daily, generate results and stop again
This is my configuration file:
version: '3.7'
services:
database:
container_name: sql_database
image: mysql:latest
command: --init-file /docker-entrypoint-initdb.d/init.sql
ports:
- 13306:3306
environment:
MYSQL_ROOT_PASSWORD: root
volumes:
- ./backup.sql:/docker-entrypoint-initdb.d/init.sql
python-container:
container_name: python-container
build: ./python_project
command: python main.py
depends_on:
- database
volumes:
- myvol:/python_project/data
volumes:
myvol:
Can someone please help me with this? Thanks!
I was just about to ask the same thing. Seems silly to keep a container going 24/7 just to run one job a day (in my case, certbot renew).
I think there may be a way to fake this using the RESTART_POLICY option with a long delay and no maximum retries, but I haven't tried it yet.
EDIT: Apparently restart_policy only works for swarms. Pity.
If the underlying container has a shell, you can set the command to run a loop with a delay, like this (86400 seconds is one day):
while true; do python main.py; sleep 86400; done
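Wired into the compose file from the question, that could look something like this (a sketch, assuming the image provides a POSIX shell):

python-container:
  container_name: python-container
  build: ./python_project
  # run the script once a day instead of exiting after one run
  command: sh -c 'while true; do python main.py; sleep 86400; done'
  depends_on:
    - database
  volumes:
    - myvol:/python_project/data

If the job has to run at a fixed time of day rather than every 24 hours, a cron entry on the host that calls docker-compose run python-container is the more precise option.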
I want to automatically restart a container if it crashes, and I am not sure how to go about doing this. I have a file, docker-compose-deps.yml, that has elasticsearch, redis, nats, and mongo. I run this in the terminal to set them up: docker-compose -f docker-compose-deps.yml up -d. After this I bring up my own containers by running docker-compose up -d. Is there a way to make these containers restart if they crash? I noticed that Docker has a built-in restart, but I don't know how to implement it.
After some feedback I added restart: always to my docker-compose file and my docker-compose-deps.yml file. Does this look correct? Is this how you would implement restart: always?
docker-compose sample
myproject-server:
build: "../myproject-server"
dockerfile: Dockerfile-dev
restart: always
ports:
- 5880:5880
- 6971:6971
volumes:
- "../myproject-server/src:/src"
working_dir: "/src"
external_links:
- nats
- mongo
- elasticsearch
- redis
myproject-associate:
build: "../myproject-associate"
dockerfile: Dockerfile-dev
restart: always
ports:
- 5870:5870
volumes:
- "../myproject-associate/src:/src"
working_dir: "/src"
external_links:
- nats
- mongo
- elasticsearch
- redis
docker-compose-deps.yml sample
nats:
image: nats
container_name: nats
restart: always
ports:
- 4222:4222
mongo:
image: mongo
container_name: mongo
restart: always
volumes:
- "./data:/data"
ports:
- 27017:27017
If you're using Compose, it has a restart option analogous to the one on the docker run command, so you can use that. Here is a link to the documentation about this part:
https://docs.docker.com/compose/compose-file/
When you deploy, it depends where you deploy to. Most container clusters, like Kubernetes, Mesos, or ECS, have some configuration you can use to auto-restart your containers. If you don't use any of these tools, you are probably starting your containers manually and can then just use the restart flag as you would locally.
Looks good to me. What you want to understand when working with Docker restart policies is what each one means. The always policy means that if the container stops for any reason, Docker automatically restarts it.
So why would you ever want to use always as opposed to, say, on-failure?
In some cases, you might have a container that you always want to ensure is running, such as a web server. If you are running a public web application, chances are you want that server to be available 100% of the time.
So for a web application you probably want always. On the other hand, if you are running a worker process that operates on a file and then naturally exits, that is a good use case for on-failure: the worker container may simply be finished processing the file, and you want to let it exit cleanly rather than have it restart.
That's where I would expect to use the on-failure policy. The point is not just knowing the syntax, but knowing when to apply which policy and what each one means.
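For example, in a compose file the two policies might be split like this (hypothetical services, just to illustrate the split):

web:
  image: nginx
  restart: always      # public-facing server: always bring it back up
worker:
  build: ./worker
  restart: on-failure  # batch job: restart only on a non-zero exit code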