Docker Ports and Airflow Tasks Test - docker

I'd like to ask what the meaning of this PORTS setup is:
[screenshot of "docker ps" output showing the PORTS column]
Sometimes I get the "5555/tcp, 8793/tcp" part and sometimes I do not. So what is the function of "5555/tcp, 8793/tcp", and why does it appear when I build my Docker container?
Second question: which Docker container should I run the command in if I want to use "airflow tasks test"?
Thanks
I expect to understand Docker containers better, especially how to use the "airflow tasks test" command.
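For reference, here is a hedged sketch of how "airflow tasks test" is usually invoked against a running container; the container name "airflow-scheduler" is a placeholder, and the DAG/task ids come from Airflow's bundled example DAGs.
docker exec -it airflow-scheduler \
  airflow tasks test example_bash_operator runme_0 2023-01-01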

Related

Executing a local/host command as a user from the host as well, in an Airflow Docker container

I have to execute some maprcli commands on a daily basis, and the maprcli command needs to be executed as a special user. The maprcli command and the user both exist on the local host.
To schedule these tasks I need to use Airflow, which itself runs in a Docker container. I am facing 2 problems here:
the maprcli is not available in the Airflow Docker container
the user it should be executed as is not available in the container.
The first problem can be solved with a volume mapping, but is there maybe a cleaner solution?
Is there any way to use the needed local/host user during the execution of a python script inside the airflow docker container?
The permissions depend on the availability of a mapr ticket that is normally generated by maprlogin.
Making this work correctly is much easier in Kubernetes than in bare docker containers because of the more advanced handling of tickets.
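A hedged sketch of the volume-mapping idea mentioned in the question; the /opt/mapr path, the ticket location, and the image name are assumptions, not details from the original setup.
# mount the host's MapR client and ticket read-only, and run as the host user's uid/gid
docker run -d --name airflow \
  -v /opt/mapr:/opt/mapr:ro \
  -v /tmp/maprticket_1000:/tmp/maprticket_1000:ro \
  --user "$(id -u):$(id -g)" \
  my-airflow-image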

Is there a point in Docker start?

So, is there a point to the command "start"? Like in "docker start -i albineContainer".
If I do this, I can't really do anything with the Alpine image inside the container; I would have to do a run and create another container with the "-it" flag and "sh" after it (or "/bin/bash", I don't remember exactly right now).
Is that how it will go most of the time? Delete and rebuild containers and use "-it" if you want to do stuff in them? Or does it depend more on the Dockerfile and how you define the CMD?
I'm new to Docker in general and trying to understand the basics of how to use it. Thanks for the help.
Running docker run/exec with -it means you run the docker container and attach an interactive terminal to it.
Note that you can also run docker applications without attaching to them, and they will still run in the background.
Docker allows you to run a program (which can be bash, but does not have to be) in an isolated environment.
For example, try running the jenkins docker image: https://hub.docker.com/_/jenkins.
This will create a container without you having to attach to it, and you will still be able to use it.
You can also attach to an existing, running container by using docker exec -it [container_name] bash.
You can also use docker logs to peek at the stdout of a certain docker container, without actually attaching to its shell interactively.
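A hedged sketch that ties these commands together; the container name "my-jenkins" is a placeholder, and the image and port are taken from the linked Docker Hub page.
# run Jenkins in the background without attaching to it
docker run -d --name my-jenkins -p 8080:8080 jenkins
# attach an interactive shell to the already-running container
docker exec -it my-jenkins bash
# or just peek at its stdout without attaching
docker logs my-jenkins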
You almost never use docker start. It's only possible to use it in two unusual circumstances:
If you've created a container with docker create, then docker start will run the process you named there. (But it's much more common to use docker run to do both things together.)
If you've stopped a container with docker stop, docker start will run its process again. (But typically you'll want to docker rm the container once you've stopped it.)
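A hedged sketch of those two cases; the "web" container name and the nginx image are placeholders, not anything from the question.
# case 1: create a container without starting it, then start it later
docker create --name web -p 8080:80 nginx
docker start web        # runs the process named in the image, like docker run would
# case 2: start a container again after stopping it
docker stop web
docker start web
# but typically you would just remove the stopped container instead
docker stop web
docker rm web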
Your question and other comments hint at using an interactive shell in an unmodified Alpine container. Neither is a typical practice. Usually you'll take some complete application and its dependencies and package it into an image, and docker run will run that complete packaged application. Tutorials like Docker's Build and run your image go through this workflow in reasonable detail.
My general day-to-day workflow involves building and testing a program outside of Docker. Once I believe it works, then I run docker build and docker run, and docker rm the container once I'm done. I rarely run docker exec: it is a useful debugging tool but not the standard way to interact with a process. docker start isn't something I really ever run.
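For what it's worth, that day-to-day loop might look roughly like the following; the image name and tag are placeholders.
# build and test happen outside Docker first; then:
docker build -t myapp:latest .
docker run --name myapp myapp:latest
# once it has finished, remove the container
docker rm myapp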

Tasks not starting for Airflow running inside a container

I'm trying to get Airflow up and running within a container and used the image available here. I found that although the DAG gets into running state (on the UI), the tasks within the DAG seem to be waiting indefinitely and never actually get triggered.
Since some of the steps in the documentation are optional, I followed these steps in order to get the example DAGs up and running within my container -
Pulled the image from dockerhub
docker pull puckel/docker-airflow
Triggered Airflow with default settings, which should start it with Sequential Executor
docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow
I'm relatively new to setting up Airflow and Docker, although I have worked with Airflow in the past. So, it's possible that I am missing something very basic here, since no one else seems to be facing the same issue. Any help would be highly appreciated.
The Sequential Executor is not a scheduler, so it only runs jobs that are created manually, from the UI or the run command. Certain kinds of tasks won't run in the Sequential Executor; I think it's SubDagOperators that won't. While honestly it should be picking up dummy, bash, or Python tasks, you may save time figuring it out if you run the scheduler with the Local Executor and a database. Puckel has an example Docker Compose file: https://github.com/puckel/docker-airflow
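A hedged sketch of that suggestion, assuming the compose file name used in the puckel/docker-airflow repository is docker-compose-LocalExecutor.yml.
git clone https://github.com/puckel/docker-airflow.git
cd docker-airflow
# brings up Postgres, the webserver and the scheduler with the Local Executor
docker-compose -f docker-compose-LocalExecutor.yml up -d
# check that all three services are running
docker-compose -f docker-compose-LocalExecutor.yml ps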

Using docker swarm to execute singular containers rather than "services"

I really enjoy the concept of having a cluster of docker machines available to execute docker services. I also like the additional features not available to singular docker containers (such as docker secret).
But I really have no need for long-standing services. My use case is simply to run a bash script that hands the Docker swarm an arbitrary number of finite commands, executing each one as a container based on the same Docker image, while using the secrets loaded with Docker swarm's secrets.
Can I do this?
I do not want to have this container be "long running". I want it to run, and then exit with the output when the bash script loaded into the container is finished.
You can apply the ideas presented in "One-shot containers on Docker Swarm" from Alex Ellis.
You still need to create a service, but with the right restart policy.
For instance, for a quick one-shot job:
docker service create --restart-condition=none --name crawler1 -e url=http://blog.alexellis.io -d crawl_site alexellis2/href-counter
(--restart-condition, not --restart-policy, as commented by ethergeist)
So by setting the restart condition to none, the container will be scheduled somewhere in the swarm as a task. The container will execute and then, when finished, it will exit.
If the container fails to start for a valid reason then the restart policy will mean the application code never executes. It would also be ideal if we could immediately return the exit code (if non-zero) and the accompanying log output, too.
For the last part, use his tool: alexellis/jaas.
Run your first one-shot container:
# jaas -rm -image alexellis2/cows:latest
The -rm flag removes the Swarm service that was used to run your container.
The exit code from your container will also be available, you can check it with echo $?.
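If you would rather inspect the one-shot service directly instead of going through jaas, the standard swarm commands can show the task state and its output; the service name is reused from the example above, and docker service logs assumes a reasonably recent Docker version.
docker service ps crawler1       # shows the task's state (e.g. Complete, or Failed with its error)
docker service logs crawler1     # shows the container's stdout/stderr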

Triggering docker run from another docker container

I have two Docker images, let's say
Image 1: This has a public API built on Python Flask
Image 2: This has some functional tests written in Python
I am looking for an option where, when the API in the Image 1 container is posted a specific param, the Image 1 container triggers a docker run of Image 2.
Is it possible to trigger a docker run from inside a docker container?
Thanks
You are talking about using Docker in Docker.
Check out this blog post for more info about how it works:
https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/
In short, you need to mount the Docker socket as a volume (and now, with Docker 1.10, its dependencies as well);
then you can run Docker in Docker.
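A hedged sketch of that socket-mount approach; "image1" and "image2" are placeholders for your two images, not real tags.
# start the Flask API container with the host's Docker socket mounted in,
# so the docker CLI inside it (if installed) talks to the host daemon
docker run -d --name api \
  -v /var/run/docker.sock:/var/run/docker.sock \
  image1
# the API process can then launch the tests image on demand, e.g.:
docker run --rm image2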
But it seems like what you are trying to do does not necessarily require that. You should rather look into making your 'worker' an actual HTTP API that you can call an endpoint on to trigger the parametrized work. That way you run one container that waits for work requests and runs them, instead of running a new container each time you need a task done.
