Airflow: run a bash command on an existing Docker container

I have Airflow running in a Docker container and I want to trigger a Python script that resides in another container. I tried the regular BashOperator, but that seems to run commands only locally. I also looked at the DockerOperator, but that one seems to want to create a new container.

The Airflow container must be able to access the Python script it is supposed to execute. If the script lives in another container, either mount a volume that the Airflow container can also access, or run the task with the KubernetesPodOperator.

Related

Run an Airflow task in another Docker container

I am considering implementing Airflow and have no prior experience with it.
I have a VM with Docker installed, and two containers running on it:
a container with a Python environment, where cron jobs currently run
a container with an Airflow installation
Is it possible to use Airflow to run a task in the Python container? I am not sure, because:
If I use the BashOperator with a command like docker exec mycontainer python main.py, I assume it will mark the task as a success even if the Python script fails (it successfully ran the command, but its responsibility ends there).
I see there is a DockerOperator, but it seems to take an image and create and run a new container, whereas I want to run a task in a container that is already running.
The closest answer I found is using Kubernetes here, which is overkill for my needs.
The BashOperator runs the bash command on:
the scheduler container if you use the LocalExecutor
one of the worker containers if you use the CeleryExecutor
a new, separate pod if you use the KubernetesExecutor
The DockerOperator, by contrast, is designed to create a new container on a Docker server (local or remote), not to manage an existing container.
To run a task (command) in an existing container (or on any other host), you can set up an SSH server inside the Python container, then use the SSHOperator to run your command on that remote SSH server (the Python container in your case).
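On the worry raised in the question: docker exec exits with the exit status of the command it ran, so a failing script does show up as a non-zero exit code to the caller. The sketch below shows one way to rely on that from Python, for example from a PythonOperator callable. It assumes the docker CLI can reach the Docker daemon from inside the Airflow container (for instance through a mounted Docker socket), which the posts above do not confirm. The function names are hypothetical; mycontainer and main.py are the names used in the question.

```python
import subprocess
import sys


def docker_exec_argv(container, command):
    """Build the argv that runs `command` inside an already running container."""
    return ["docker", "exec", container, *command]


def run_checked(argv):
    """Run argv and raise CalledProcessError if it exits with a non-zero status.

    An exception raised inside an Airflow task callable marks the task as failed.
    """
    return subprocess.run(argv, check=True, capture_output=True, text=True)


# Hypothetical usage inside a task callable (needs the docker CLI and daemon access):
# run_checked(docker_exec_argv("mycontainer", ["python", "main.py"]))
```

The same run_checked helper would work with an ssh command line instead of docker exec if the SSH route from the answer is preferred.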

How can I run script automatically after Docker container startup without altering main process of container

I have a Docker container which runs a web service. After the container process is started, I need to run a single command. How can I do this automatically, either by using Docker Compose or Docker?
I'm looking for a solution that does not require me to substitute the original container process with a Bash script that runs sleep infinity etc. Is this even possible?

is there a `docker up` command, like `vagrant up`?

Is there a docker command which works like the vagrant up command?
I'd like to use the ArangoDB Docker image and provide a Dockerfile for my team without forcing my teammates to learn the details of its operation; it should 'just work'. Within the project root, I would expect the database to start and stop with a standard docker command. Does this not exist? If so, why not?
Docker Compose could do it.
docker-compose up builds the image if needed, creates the container and starts it.
docker-compose stop stops the container.
docker-compose start starts the stopped container again.
docker-compose down stops the container and removes it, together with the networks Compose created; images are removed only if you add the --rmi option.
With a Compose file you can configure ArangoDB (exposed ports, volume mapping for database initialisation, etc.). Place the Compose file in the project root and run the up command.

How to configure cassandra.yaml, which is inside the Cassandra Docker image at /etc/cassandra/cassandra.yaml

I am trying to edit cassandra.yaml, which is inside a Docker container at /etc/cassandra/cassandra.yaml. I can edit it by logging into the container, but how can I do it from the host?
There are multiple ways to do this from the host. You can use COPY or RUN in a Dockerfile, together with basic Linux commands such as sed or cat, to place your configuration into the image. Alternatively, you can pass environment variables when running the Cassandra image, and they will be passed on to the container that is started. You can also bind-mount the file from the host into the container, mapping the configuration you want onto cassandra.yaml as shown below:
$ docker container run -v ~/home/MyWorkspace/cassandra.yaml:/etc/cassandra/cassandra.yaml your_cassandra_image_name
If you are using Docker Swarm, you can use Docker configs to store the configuration files externally (external services such as etcd or Consul can also be used). Hope this helps.
To edit cassandra.yaml :
1) Copy your file from your Docker container to your system
From command line :
docker ps
(To get your container id)
Then :
docker cp your_container_id:/etc/cassandra/cassandra.yaml C:\Users\your_destination
Once the file is copied, you should be able to see it in the your_destination folder
2) Open it and make the changes you want
3) Copy your file back into your Docker container
docker cp C:\Users\your_destination\cassandra.yaml your_container_id:/etc/cassandra
4) Restart your container for the changes to be effective
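Step 2 can also be scripted instead of done by hand. The sketch below rewrites one top-level key: value line in the copied file using only the Python standard library. It is a plain text substitution, not a YAML parser, so it assumes the setting sits on a single unindented, uncommented line. The function names are hypothetical.

```python
import re


def set_top_level_setting(yaml_text, key, value):
    """Replace the value of an unindented `key: ...` line; raise if the key is absent."""
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    # A callable replacement avoids backslash escapes in `value` being interpreted.
    new_text, count = pattern.subn(lambda _match: f"{key}: {value}", yaml_text)
    if count == 0:
        raise KeyError(f"{key!r} not found as a top-level setting")
    return new_text


def update_file(path, key, value):
    """Apply set_top_level_setting to the file at `path`, rewriting it in place."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(set_top_level_setting(text, key, value))
```

For example, update_file("cassandra.yaml", "cluster_name", "'My Cluster'") would change that one line, after which the file can be copied back as in step 3.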

Docker CMD loop

I'm new to Docker, but I know that a Docker container should have only one process. Is it possible to run a script inside a Docker container multiple times, like a cron job would?
For example, I have a Python script which manipulates my database. This process should run every hour. For that I have created a container based on a file like this:
FROM python:slim
COPY ac.py ac.py
RUN pip install pymongo
CMD [ "python", "./ac.py" ]
If I pull this image from my repository and start it in any environment, the process runs only once.
Is there any possibility to start it like a cron job (without using an Ubuntu image inside my Docker container)?
By the way, I want to deploy this container in Google Cloud. Is there any cloud provider that offers functionality like that?
You could leverage Docker Swarm and create a service with its restart condition set to any and a delay between restarts of 1h.
docker service create --restart-condition any --restart-delay 1h my-python-image:latest
See docker service create reference: https://docs.docker.com/engine/reference/commandline/service_create/#options
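A different approach from the Swarm restart policy above, shown only as a sketch: keep one long-running Python process in the container and let it repeat the work at a fixed interval. The names run_periodically and job are hypothetical, and it is assumed that the work done in ac.py can be wrapped in a function. The interval is measured from the end of one run to the start of the next, so the schedule drifts by the job's run time.

```python
import time


def run_periodically(job, interval_seconds, iterations=None, sleep=time.sleep):
    """Call job(), wait interval_seconds, and repeat.

    iterations=None loops forever; a number stops after that many runs,
    which is mainly useful for testing. `sleep` is injectable so that
    tests do not have to wait in real time.
    """
    runs = 0
    while iterations is None or runs < iterations:
        job()
        runs += 1
        if iterations is None or runs < iterations:
            sleep(interval_seconds)
    return runs


def job():
    # Placeholder for the database work done in ac.py (assumption).
    print("running scheduled job")


# For an hourly schedule, the container's script would end with:
# run_periodically(job, 60 * 60)
```

With this layout the container still runs a single process, and the CMD line of the Dockerfile stays as it is.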
