Multiple Docker containers and Celery - docker

We have the following structure of the project right now:
Web-server that processes incoming requests from the clients.
Analytics module that provides some recommendations to the users.
We decided to keep these modules completely independent and move them to different docker containers. When a query from a user arrives at the web-server, it sends another query to the analytics module to get the recommendations.
For recommendations to be consistent, we need to do some background calculations periodically and when, for instance, new users register within our system. Also, some background tasks are connected purely to the web-server logic. For these purposes we decided to use a distributed task queue, e.g., Celery.
There are the following possible scenarios of task creation and execution:
Task enqueued at the web-server, executed at the web-server (e.g., process uploaded image)
Task enqueued at the web-server, executed at the analytics module (e.g., calculate recommendations for a new user)
Task enqueued at the analytics module and executed there (e.g., periodic update)
So far I see 3 rather weird possibilities to use Celery here:
I. Celery in separate container and does everything
Move Celery to the separate docker container.
Provide all of the necessary packages from both web-server and analytics to execute tasks.
Share tasks code with other containers (or declare dummy tasks at web-server and analytics)
This way, we lose isolation, as the functionality is shared by the Celery container and the other containers.
II. Celery in separate container and does much less
Same as I, but tasks are now just requests to the web-server and the analytics module, which handle them asynchronously, with the result polled inside the task until it is ready.
This way, we get benefits from having the broker, but all heavy computations are moved from Celery workers.
III. Separate Celery in each container
Run Celery both in web-server and analytics module.
Add dummy task declarations (of analytics tasks) to web-server.
Add 2 task queues, one for web-server, one for analytics.
This way, tasks scheduled at the web-server can be executed in the analytics module. However, we still have to share the task code across the containers or use dummy tasks, and, additionally, we need to run Celery workers in each container.
What is the best way to do this, or should the logic be changed completely, e.g., by moving everything inside one container?

First, let's clarify the difference between the celery library (which you get with pip install or in your setup.py) and the celery worker - the actual process that dequeues tasks from the broker and handles them. Of course you might want to have multiple workers/processes (for example, to route different tasks to different workers).
Let's say you have two tasks, calculate_recommendations_task and periodic_update_task, and you want to run each on its own worker, i.e., recommendation_worker and periodic_worker.
Another process will be celery beat, which simply enqueues the periodic_update_task into the broker every x hours.
In addition, let's say you have a simple web server implemented with bottle.
I'll assume you want to run the Celery broker & backend with Docker too, and I'll pick the commonly recommended combination - RabbitMQ as the broker and Redis as the backend.
So now we have six containers; I'll write them in a docker-compose.yml:
version: '2'
services:
  rabbit:
    image: rabbitmq:3-management
    ports:
      - "15672:15672"
      - "5672:5672"
    environment:
      - RABBITMQ_DEFAULT_VHOST=vhost
      - RABBITMQ_DEFAULT_USER=guest
      - RABBITMQ_DEFAULT_PASS=guest
  redis:
    image: library/redis
    command: redis-server /usr/local/etc/redis/redis.conf
    expose:
      - "6379"
    ports:
      - "6379:6379"
  recommendation_worker:
    image: recommendation_image
    command: celery worker -A recommendation.celeryapp:app -l info -Q recommendation_worker -c 1 -n recommendation_worker@%h -Ofair
  periodic_worker:
    image: recommendation_image
    command: celery worker -A recommendation.celeryapp:app -l info -Q periodic_worker -c 1 -n periodic_worker@%h -Ofair
  beat:
    image: recommendation_image
    command: <not sure>
  web:
    image: web_image
    command: python web_server.py
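For the beat service the command is left open above; a minimal sketch of one possibility (an assumption, not part of the original answer) is to run celery beat against the same app, with the periodic schedule defined in the Celery config that the app loads:
  beat:
    image: recommendation_image
    command: celery beat -A recommendation.celeryapp:app -l info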
Both Dockerfiles, which build the recommendation_image and the web_image, should install the celery library. Only the recommendation_image should contain the task code, because its workers are going to handle those tasks:
RecommendationDockerfile:
FROM python:2.7-wheezy
RUN pip install celery
COPY tasks_src_code..
WebDockerfile:
FROM python:2.7-wheezy
RUN pip install celery
RUN pip install bottle
COPY web_src_code..
The other images (rabbitmq:3-management & library/redis) are available from Docker Hub and will be pulled automatically when you run docker-compose up.
Now here is the thing: in your web server you can trigger Celery tasks by their string names and pull results by task id (without sharing the code). web_server.py:
import bottle
from bottle import request
from celery import Celery
from celery.result import AsyncResult

app = bottle.Bottle()
rabbit_path = 'amqp://guest:guest@rabbit:5672/vhost'
celeryapp = Celery('recommendation', broker=rabbit_path)
celeryapp.config_from_object('config.celeryconfig')

@app.route('/trigger_task', method='POST')
def trigger_task():
    # enqueue the task by its string name - no shared task code needed
    r = celeryapp.send_task('calculate_recommendations_task', args=(1, 2, 3))
    return r.id

@app.route('/trigger_task_res', method='GET')
def trigger_task_res():
    task_id = request.query['task_id']
    result = AsyncResult(task_id, app=celeryapp)
    if result.ready():
        return str(result.get())
    return result.state

if __name__ == '__main__':
    # matches the compose command "python web_server.py"; 8080 is bottle's default port
    app.run(host='0.0.0.0', port=8080)
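A quick usage sketch (assuming the web service's port 8080 is published, which the compose file above does not do):
# enqueue the task; the response body is the task id
curl -X POST http://localhost:8080/trigger_task
# poll for the result using that id
curl "http://localhost:8080/trigger_task_res?task_id=<task-id-from-previous-call>"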
The last file, config/celeryconfig.py:
CELERY_ROUTES = {
    'calculate_recommendations_task': {
        'exchange': 'recommendation_worker',
        'exchange_type': 'direct',
        'routing_key': 'recommendation_worker'
    }
}
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
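For completeness, a minimal sketch of what the task code baked into recommendation_image might look like (module path and function bodies are assumptions, not part of the original answer):
# recommendation/tasks.py (hypothetical)
from recommendation.celeryapp import app

@app.task(name='calculate_recommendations_task')
def calculate_recommendations_task(a, b, c):
    # placeholder for the real recommendation computation
    return a + b + c

@app.task(name='periodic_update_task')
def periodic_update_task():
    # placeholder for the periodic recalculation
    pass
Explicit name= values keep the string names stable, which is what lets the web server call send_task('calculate_recommendations_task', ...) without importing this module. Note also that for result.ready()/result.get() to work, the config needs a result backend pointing at the redis service (e.g. CELERY_RESULT_BACKEND = 'redis://redis:6379/0'), which the snippet above omits.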


How to package several services in one docker image?

I have a Docker Compose application which works fine locally. I would like to create an image from it and upload it to Docker Hub in order to pull it from my Azure virtual machine without passing all the files. Is this possible? How can I do it?
I tried to upload the image I see in Docker Desktop and then pull it from the VM, but the container does not start up.
Here I attach my .yml file. There is only one service at the moment, but in the future there will be multiple microservices, which is why I want to use Compose.
version: "3.8"
services:
dbmanagement:
build: ./dbmanagement
container_name: dbmanagement
command: python manage.py runserver 0.0.0.0:8000
volumes:
- ./dbmanagement:/dbmandj
ports:
- "8000:8000"
environment:
- POSTGRES_HOST=*******
- POSTGRES_NAME=*******
- POSTGRES_USER=*******
- POSTGRES_PASSWORD=*******
Thank you for your help
The answer is: yes, you can but you should not
According to the Docker official docs:
It is generally recommended that you separate areas of concern by using one service per container
Also check this:
https://stackoverflow.com/a/68593731/3957754
docker-compose is enough
docker-compose exists for exactly that: running several services with one command (and minimal configuration), commonly on the same server.
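If the goal is simply to avoid copying the source files to the VM, one possible approach (a sketch; the image name youruser/dbmanagement is a hypothetical placeholder) is to add an image: tag next to build:, push the built image to Docker Hub, and have the compose file on the VM reference only that image:
# locally, the service declares both build: and image:
#   dbmanagement:
#     build: ./dbmanagement
#     image: youruser/dbmanagement:latest
docker-compose build
docker-compose push

# on the VM, the service declares only image: youruser/dbmanagement:latest
docker-compose pull
docker-compose up -d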
foreground process
In order to work, a docker container needs a foreground process. To understand what this is, check the following links. As an extreme summary: a foreground process is one that, when you launch it from the shell, takes over the shell so that you cannot enter more commands. You need to press ctrl + c to kill the process and get your shell back.
https://unix.stackexchange.com/questions/175741/what-is-background-and-foreground-processes-in-jobs
https://linuxconfig.org/understanding-foreground-and-background-linux-processes
The "fat" container
Anyway, if you want to join several services or processes in one container (previously an image), you can do it with supervisor.
Supervisor can work as our foreground process. Basically, you register one or many Linux processes and then supervisor will start them.
how to install supervisor
sudo apt-get install supervisor
source: https://gist.github.com/hezhao/bb0bee800531b89d7be1#file-supervisor_cmd-sh
add single config: /etc/supervisor/conf.d/myapp.conf
[program:myapp]
autostart = true
autorestart = true
command = python /home/pi/myapp.py
environment=SECRET_ID="secret_id",SECRET_KEY="secret_key_avoiding_%_chars"
stdout_logfile = /home/pi/stdout.log
stderr_logfile = /home/pi/stderr.log
startretries = 3
user = pi
source: https://gist.github.com/hezhao/bb0bee800531b89d7be1
start it
sudo supervisorctl start myapp
sudo supervisorctl tail myapp
sudo supervisorctl status
In the previous sample, we used supervisor to start a Python process.
multiple process with supervisor
You just need to add more [program] sections to the config file:
[program:php7.2]
command=/usr/sbin/php-fpm7.2-zts
process_name=%(program_name)s
autostart=true
autorestart=true
[program:dropbox]
process_name=%(program_name)s
command=/app/.dropbox-dist/dropboxd
autostart=true
autorestart=true
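To tie this back to Docker: a minimal Dockerfile sketch (base image, file names and paths are assumptions) that makes supervisord the container's foreground process could look like this; the -n flag keeps supervisord in the foreground so the container stays alive:
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y supervisor && rm -rf /var/lib/apt/lists/*
# the [program:...] sections shown above go into conf.d
COPY myapp.conf /etc/supervisor/conf.d/myapp.conf
COPY myapp.py /home/pi/myapp.py
# -n (nodaemon) runs supervisord in the foreground as the container's main process
CMD ["/usr/bin/supervisord", "-n", "-c", "/etc/supervisor/supervisord.conf"]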
Here are some examples, just like your requirement: several processes in one container:
canvas lms: basically starts 3 processes: postgres, redis and a ruby app
https://github.com/harvard-dce/canvas-docker/blob/master/assets/supervisord.conf
nginx + php + ssh
https://gist.github.com/pollend/b1f275eb7f00744800742ae7ce403048#file-supervisord-conf
nginx + php
https://gist.github.com/lovdianchel/e306b84437bfc12d7d33246d8b4cbfa6#file-supervisor-conf
mysql + redis + mongo + nginx + php
https://gist.github.com/nguyenthanhtung88/c599bfdad0b9088725ceb653304a91e3
You can also configure a web dashboard:
https://medium.com/coinmonks/when-you-throw-a-web-crawler-to-a-devops-supervisord-562765606f7b
More samples with docker + supervisor:
https://gist.github.com/chadrien/7db44f6093682bf8320c
https://gist.github.com/damianospark/6a429099a66bfb2139238b1ce3a05d79

How did the docker service manage to call an instance from a separate docker container?

I have recently started using Docker+Celery. I have also shared the full sample code for this example on GitHub, and the following are some code snippets from it to help explain my point.
For context, my example is designed to be a node that subscribes to events in a system of microservices. This node comprises the following services:
the Subscriber (using kombu to subscribe to events)
the Worker (using celery for async task acting on the events)
Redis (as message broker and result backend for celery)
The services are defined in a docker-compose.yml file as follows:
version: "3.7"
services:
# To launch the Subscriber (using Kombu incl. in Celery)
subscriber:
build: .
tty: true
#entrypoint: ...
# To launch Worker (Celery)
worker:
build: .
entrypoint: celery worker -A worker.celery_app --loglevel=info
depends_on:
- redis
redis:
image: redis
ports:
- 6379:6379
entrypoint: redis-server
For simplicity, I have left out the code for the subscriber, and I thought using the Python interactive shell in the subscriber container for this example should suffice:
python3
>>> from worker import add
>>> add.delay(2,3).get()
5
And in the worker container logs:
worker_1 | [2020-09-17 10:12:34,907: INFO/ForkPoolWorker-2] worker.add[573cff6c-f989-4d06-b652-96ae58d0a45a]: Adding 2 + 3, res: 5
worker_1 | [2020-09-17 10:12:34,919: INFO/ForkPoolWorker-2] Task worker.add[573cff6c-f989-4d06-b652-96ae58d0a45a] succeeded in 0.011764664999645902s: 5
While everything seems to be working, I felt uneasy. I thought this example doesn't respect the isolation principle of a docker container.
Aren't containers designed to be isolated at the level of their OS, processes and network? And if containers have to communicate, shouldn't it be done via IP addresses and network protocols (TCP/UDP etc.)?
Firstly, the worker and subscriber run the same codebase in my example, thus no issue is expected on the import statement.
However, the celery worker is launched from the entrypoint in the worker container, thus, how did the subscriber manage to call the celery worker instance in the supposedly isolated worker container?
To further verify that it is in fact calling the Celery worker instance from the worker container, I stopped the worker container and repeated the Python interactive shell example in the subscriber container. The request waited (which is expected of Celery) and returned the same result as soon as the worker container was turned back on. So IMO, yes, a service from one container is calling an app instance from another container WITHOUT networking, unlike the case of connecting to Redis (using an IP address etc.).
Please advise if my understanding is incorrect, or if there is a wrong implementation somewhere which I am not aware of.
Both the consumer (worker) and the producer (subscriber) are configured to use Redis (redis) as both broker and result backend. That is why it all worked. When you executed add.delay(2,3).get() in the subscriber container, it sent the task to Redis, and it got picked up by the Celery worker running in a different container.
Keep in mind that the Python process running the add.delay(2,3).get() code is running in the subscriber container, while the ForkPoolWorker-2 process that executed the add() function and stored the result in the result backend is running in the worker container. These processes are completely independent.
The subscriber process did not call anything in the worker container! - In plain English what it did was: "here (in Redis) is what I need done, please workers do it and let me know you are done so that I can fetch the result".
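For reference, a minimal sketch of what the shared worker module might look like (the Redis URL and task body are assumptions based on the compose file, not the poster's actual code):
# worker.py (hypothetical reconstruction)
from celery import Celery

# "redis" is the service name from docker-compose.yml, resolved over the compose network
celery_app = Celery('worker',
                    broker='redis://redis:6379/0',
                    backend='redis://redis:6379/0')

@celery_app.task(name='worker.add')
def add(x, y):
    # both containers import this module, but only the worker container executes it
    return x + y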
Docker-compose creates a default docker network for containers created in a single file. Since you are pointing everything appropriately, it is making the requests along that network, which is why that is succeeding. I would be surprised to hear that this still worked if you were to, for example, run each container separately in parallel without using docker-compose.

Docker Compose attach one service to stdin and stdout

Something I'm trying to do is create a docker-compose application that has a single service act as a REPL that can interact with the rest of the services. I tried a variety of ways to get only this service attached to stdin and stdout but I haven't found anything elegant that worked. This stackoverflow post's answer said stdin_open: true and tty: true would work and here's what I made with it:
version: '3'
services:
  redis:
    image: redis
  python:
    image: python
    entrypoint: /bin/sh
    stdin_open: true
    tty: true
Running docker-compose up still shows the logs of both services, and docker-compose up -d detaches both of the services. For this example, is there an elegant way to get an interactive shell to the python service while only running docker-compose up ... (i.e. not running docker exec, etc.)?
You can docker-compose run an alternate command using the image: and other settings in a Docker Compose YAML file. If that service depends_on: other services, it will start them. The one thing to be aware of is that it will not by default publish the declared ports:.
docker-compose run python /bin/sh
(The Docker setup tends to be a little more optimized around long-running network server processes, like the Redis installation here, and less for "commands" that rely on their stdin for input. Consider packaging your application into an image, but generally using host tools for learning a language and day-to-day development. For Python in particular, a virtual environment gives a self-contained playground where you can install packages, as your user account, without interfering with the system Python.)
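If you do need the ports: that the service declares, docker-compose run can publish them with its --service-ports flag (not needed for this REPL example, just worth knowing):
docker-compose run --service-ports python /bin/sh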

docker-compose conditionally build containers

Our team is new to running a micro-service ecosystem, and I am curious how one would achieve conditionally loading docker containers from a compose file, or from another variable-based script.
An example use-case would be:
Doing front-end development that depends on a few different services. We will label those DockerA through DockerD.
Dependency Matrix
Feature1 - DockerA
Feature2 - DockerA and DockerB
Feature3 - DockerA and DockerD
I would like to be able to run something like the following
docker-compose --feature1
or
magic-script -service DockerA -service DockerB
Basically, I would like to run the command to conditionally start the APIs that I need.
I am already aware of using various mock servers for UI development, but want to avoid them.
Any thoughts on how to configure this?
You can stop all services after creating them and then selectively start them one by one. E.g.:
version: "3"
services:
web1:
image: nginx
ports:
- "80:80"
web2:
image: nginx
ports:
- "8080:80"
docker-compose up -d
Creating network "composenginx_default" with the default driver
Creating composenginx_web2_1 ... done
Creating composenginx_web1_1 ... done
docker-compose stop
Stopping composenginx_web1_1 ... done
Stopping composenginx_web2_1 ... done
Now any service can be started using, e.g.,
docker-compose start web2
Starting web2 ... done
Also, using linked services, there's the scale command, which can change the number of running containers for a service (it can add containers without a restart).
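For example (the service names come from the compose file above), the newer form of the scale command is a flag on docker-compose up:
# bring the stack up with two containers for the web2 service
docker-compose up -d --scale web2=2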

Feedback Request - Restarting Docker container on file change

I have this working, but I was wondering if there are any potential side effects or even a better way to do this. The example below is generic.
I have a docker-compose file with two containers (container_1 and container_2).
container_1 exposes a volume that contains various config files that it uses to run the installed service.
container_2 mounts the volume from container_1 and periodically runs a script that pulls files and updates the config of the service running in container_1.
Every time the configs are updated I want to restart the service in container_1 without having to use cron or some of the other methods I have seen discussed.
My solution:
I put a script on container_1 that checks if the config file has been updated (the file is initially empty and that md5sum is stored in a separate file) and, if the file has changed based on its md5sum, it updates the stored hash and kills the process.
In the compose file I have a healthcheck that runs the script periodically, and restart is set to always. When the script in container_2 runs and updates the config files in container_1, the monitor_config.sh script on container_1 will kill the service's process, and the container will be restarted and reload the configs.
monitor_config.sh
#!/bin/sh
# current_file_hash initially contains the md5sum of an empty file
echo "Checking if config has updated"
config_hash=$(md5sum /path/to/config_file)
current_hash=$(cat /path/to/current_file_hash)
if [ "$config_hash" != "$current_hash" ]
then
    echo "config has been updated, restarting service"
    md5sum /path/to/config_file > /path/to/current_file_hash
    kill $(pgrep service)
else
    echo "config unchanged"
fi
docker-compose.yml
version: '3.2'
services:
  service_1:
    build:
      context: /path/to/Dockerfile1
    healthcheck:
      test: ["CMD-SHELL", "/usr/bin/monitor_config.sh"]
      interval: 1m30s
      timeout: 10s
      retries: 1
    restart: always
    volumes:
      - type: volume
        source: conf_volume
        target: /etc/dir_from_1
  service_2:
    build:
      context: /path/to/Dockerfile2
    depends_on:
      - service_1
    volumes:
      - type: volume
        source: conf_volume
        target: /etc/dir_from_1
volumes:
  conf_volume:
I know this is not the intended use of healthcheck but it seemed like the cleanest way to get the desired effect while still maintaining only one running process in each container.
I have tried with and without tini in container_1 and it seems to work as expected in both cases.
I plan on extending the interval of the healthcheck to 24 hours as the script in container_2 only runs once a day.
Use case
I'm running Suricata in container_1 and pulledpork in container_2 to update the rules for Suricata. I want to run pulledpork once a day and, if the rules have been updated, restart Suricata to load the new rules.
You may want to look at how tools like confd work, which would run as your container_1 entrypoint. It runs in the foreground, polls an external configuration source, and upon a change it rewrites the config files inside the container and restarts the spawned application.
To make your own tool like confd you'd need to include your restart trigger, maybe your health monitoring script, and then make the stdin/stdout/stderr pass through along with any signals so that your restart tool becomes transparent inside the container.
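As a rough illustration of that idea, here is a minimal entrypoint wrapper sketch for container_1 (paths, the service invocation and the poll interval are assumptions); it supervises the real process, forwards stop signals, and restarts the child when the config hash changes, instead of relying on the healthcheck:
#!/bin/sh
# restart_on_change.sh - hypothetical entrypoint for container_1
CONFIG=/path/to/config_file
APP="suricata -c $CONFIG"            # assumed service invocation

term() { kill "$child" 2>/dev/null; exit 0; }
trap term TERM INT                    # forward docker stop / ctrl-c to the child

last_hash=$(md5sum "$CONFIG")
$APP &                                # run the real service in the background
child=$!

while true; do
    sleep 60                          # poll interval (assumption)
    new_hash=$(md5sum "$CONFIG")
    if [ "$new_hash" != "$last_hash" ]; then
        echo "config changed, restarting service"
        last_hash=$new_hash
        kill "$child"
        wait "$child" 2>/dev/null
        $APP &
        child=$!
    fi
done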
