`docker run` as a Prefect task

My actual workloads that should be run as tasks within a Prefect flow are all packaged as Docker images. So a flow is basically just "run this container, then run that container".
However, I'm unable to find any examples of how I can easily start a docker container as a task. Basically, I just need docker run from a flow.
I'm aware of https://docs.prefect.io/api/latest/tasks/docker.html and tried various combinations of CreateContainer and StartContainer, but without any luck.

Using the Docker tasks from Prefect's Task Library could look something like this for your use case:
from prefect import task, Flow
from prefect.tasks.docker import (
    CreateContainer,
    StartContainer,
    GetContainerLogs,
    WaitOnContainer,
)

create = CreateContainer(image_name="prefecthq/prefect", command="echo 12345")
start = StartContainer()
wait = WaitOnContainer()
logs = GetContainerLogs()

@task
def see_output(out):
    print(out)

with Flow("docker-flow") as flow:
    container_id = create()
    s = start(container_id=container_id)
    w = wait(container_id=container_id)
    l = logs(container_id=container_id)
    l.set_upstream(w)
    see_output(l)

flow.run()
This snippet above will create a container, start it, wait for completion, retrieve logs, and then print the output of echo 12345 to the command line.
Alternatively, you could use the Docker Python client directly in your own tasks: https://docker-py.readthedocs.io/en/stable/api.html#module-docker.api.container
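For example, a custom task that wraps the docker-py client could look something like the following. This is only a minimal sketch (the image name and command are placeholders), and it assumes docker-py is installed and the local Docker daemon is reachable:
import docker
from prefect import task, Flow

@task
def run_container(image, command):
    # Connect to the local Docker daemon using the usual environment defaults
    client = docker.from_env()
    # Run the container to completion, remove it afterwards, and capture its output
    output = client.containers.run(image, command, remove=True)
    return output.decode()

@task
def see_output(out):
    print(out)

with Flow("docker-py-flow") as flow:
    out = run_container("prefecthq/prefect", "echo 12345")
    see_output(out)

flow.run()
This keeps the whole "run this container, then run that container" logic in ordinary tasks, at the cost of managing the container lifecycle yourself.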

Related

Is it possible to run a command in a Docker image as a test in Bazel?

I would like to run a command inside a container to test that it works. It should be invoked by bazel test.
Something like this:
container_test(
    image = "//:my_image",
    test_command = "exit 1",
)
I noticed this: https://github.com/bazelbuild/rules_docker/blob/master/contrib/test.bzl#L125
However it isn't documented.
How should I approach this in Bazel?
Take a look at the sample test rule here
This is a test rule that generates a script (script) which can be invoked from the CLI
The script then exits with a non-zero exit code to indicate that the test failed (or 0 for success)
The script is written as an executable output (ctx.actions.write), and the rule declares the list of files it needs available at runtime (runfiles)
This function is then wrapped as a Bazel rule (see the full guide here)
So, how would you proceed towards creating your container test rule?
The script we want to generate above is probably some usage of docker run --rm IMAGE [COMMAND] [ARG...] to create a container from an image, run a command, and remove the container when done
Don't forget to set the script exit status based on the exit status of the docker command (as done in the example, where they copy the exit status of grep as the exit status for the overall script)
Update the sample above to use that docker command, and fill in the path to the image accordingly
See f.path in the script above showing how they access the path of an individual source file
You will need to make sure docker is available when your bazel rules are evaluated
I haven't done this fully myself since I don't have a computer with both bazel and docker, but this should be enough to get you started :)
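To make those steps a bit more concrete, here is a rough Starlark sketch of such a rule. It is untested; the attribute names (image_tar, image_name, test_command) and the load-then-run approach are my assumptions, not part of rules_docker:
def _container_test_impl(ctx):
    # Generate the test script: load the image tarball, then run the command inside it.
    # With `set -e`, the script exits with the status of `docker run`, which becomes the test result.
    script = "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        "docker load -i {}".format(ctx.file.image_tar.short_path),
        "docker run --rm {} sh -c '{}'".format(ctx.attr.image_name, ctx.attr.test_command),
    ])
    out = ctx.actions.declare_file(ctx.label.name + ".sh")
    ctx.actions.write(out, script, is_executable = True)
    # The image tarball must be available in the runfiles when the test script runs.
    return [DefaultInfo(executable = out, runfiles = ctx.runfiles(files = [ctx.file.image_tar]))]

container_test = rule(
    implementation = _container_test_impl,
    test = True,
    attrs = {
        "image_tar": attr.label(allow_single_file = True),
        "image_name": attr.string(),
        "test_command": attr.string(),
    },
)
You would then point image_tar at the tarball produced for your image (with rules_docker that is typically the .tar output of the container_image target) and make sure Docker is available on the machine running bazel test.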
Good luck!

Limit the docker containers executed in a loop

I am very new to docker and have created a Dockerfile to build an image that executes Protractor tests.
That Dockerfile has an entry point that expects a parameter with the name of the suite I want to execute.
It all runs very well when I provide suite names on the command line.
I have about 30 test suites, and I am using another .sh file which filters out suite names and, in a for loop, runs docker commands with the different suite names.
Now I do not want to run 30 suites simultaneously, but want to set a limit of, say, 6 at a time and keep the others waiting until one is finished.
I execute like this:
for (( i=0; i<${tests}; i++ ));
do
docker run -dit containername $testSuiteName
done
So how can I limit the maximum number of executions at a time?
There are a number of ways of tackling this problem. Here is one possible solution.
You can treat this as a shell scripting problem, rather than a Docker problem. For example, consider the following, which instead of docker run ... just uses sleep:
#!/bin/bash
let count=5
let tests=20

for (( i=0; i<tests; i++ )); do
    sleep $((RANDOM % 10)) &
    echo "started $!"
    let count--

    if (( count <= 0 )); then
        # wait -n (bash 4.3+) returns as soon as any background job exits
        wait -n
        let count++
    fi
done

echo "waiting for remaining jobs"
wait
echo "all done"
This starts $count processes in parallel, and then waits for one to
exit. When a process exits, it immediately starts a new one. Once it
has started all the jobs, it simply waits for everything to finish.
Using this model, you would drop the -d from your docker run
command line, since you need the shell to track the background
processes. Instead of sleep ... &, you would run:
docker run containername $testSuiteName > /dev/null 2>&1 &
Note that I've dropped the -i and -t options here, since it looks like
you're running these tests non-interactively.
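If you would rather drive this from Python than from the shell, the same bounded-concurrency pattern can be expressed with a thread pool. This is only a rough sketch; the image name and suite list below are placeholders:
import subprocess
from concurrent.futures import ThreadPoolExecutor

suites = ["suite1", "suite2", "suite3"]  # placeholder suite names

def run_suite(suite):
    # Run the container in the foreground so the worker thread blocks until it exits
    return subprocess.run(["docker", "run", "--rm", "containername", suite]).returncode

# At most 6 containers run at any one time; the remaining suites wait for a free worker
with ThreadPoolExecutor(max_workers=6) as pool:
    exit_codes = list(pool.map(run_suite, suites))

print(exit_codes)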
A few more possible solutions:
If you run your tests in Jenkins, you can create a job that runs one test suite. Set the maximum number of executors to 6 and start as many tests as you want; Jenkins won't run more than 6 jobs at a time.
This is, I think, the ideal and most correct approach, yet also the most difficult one: use an orchestrator such as Kubernetes, which controls all your Docker containers. Unfortunately, I don't have a step-by-step guide for how to achieve this, but it is really the most professional way to tackle your problem.

How to execute a prefect Flow on a docker image?

My goal:
I have a built docker image and want to run all my Flows on that image.
Currently:
I have the following flow, which runs on a LocalDaskExecutor.
The server on which the agent is running is a different Python environment from the one needed to execute my_task, hence the need to run inside a pre-built image.
My question is: How do I run this Flow on a Dask Executor such that it runs on the docker image I provide (as environment)?
import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")

with Flow("My Flow") as flow:
    results = hello_task()

flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)
I thought that I needed to start the server and the agent on that docker image first (as discussed here), but I guess there must be a way to simply run the Flow on a provided image.
Update 1
Following this tutorial, I tried the following:
import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
from prefect.environments.storage import Docker
@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")

with Flow("My Flow") as flow:
    results = hello_task()

flow.storage = Docker(registry_url='registry.gitlab.com/my-repo/image-library')
flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)
flow.register(project_name="testing")
But this created an image which it then uploaded to the registry_url provided. Afterwards, when I tried to run the registered flow, it pulled the newly created image, and the run has now been stuck in the Submitted for execution state for minutes.
I don't understand why it pushed an image and then pulled it. I already have an image built on this registry; I'd like to specify an image which should be used for task execution.
The way I ended up achieving this is as follows:
Run prefect server start on the server (i.e. not inside docker).
Apparently running docker-compose inside docker is not a good idea.
Run prefect agent start inside the docker image.
Make sure the flows are accessible to the docker image (e.g. by mounting a shared volume between the container and the server).
You can see the source of my answer here.
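As a side note on the Update above: in the Prefect 0.x API used here, Docker storage always builds (and pushes) a new image with the serialized flow baked in, which is why an image was pushed and then pulled. As far as I know, the Docker storage class also accepts a base_image argument, so a sketch like the following may let that build start from your pre-built image instead of the default Prefect base; the registry URL and image name are placeholders and this is untested:
from prefect import task, Flow
from prefect.environments.storage import Docker

@task
def hello_task():
    print("Hello, Docker!")

with Flow("My Flow") as flow:
    hello_task()

# Build the flow's storage image on top of an existing image rather than the
# default Prefect base image (registry URL and image name are placeholders)
flow.storage = Docker(
    registry_url="registry.gitlab.com/my-repo/image-library",
    base_image="registry.gitlab.com/my-repo/image-library/my-prebuilt-image:latest",
)
flow.register(project_name="testing")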

Explanation of Container From Scratch

I am learning about containers and Docker in particular. I just watched this Liz Rice video in which she created a container from scratch (the repo is on github.com/lizrice). I wasn't able to follow it completely, as I am new to Docker and containers and I don't know the Go programming language. However, I wanted to see if someone could give me a very quick explanation of what the items in the code below are doing or trying to accomplish:
package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "os/exec"
    "path/filepath"
    "strconv"
    "syscall"
)

// go run main.go run <cmd> <args>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    case "child":
        child()
    default:
        panic("help")
    }
}

func run() {
    fmt.Printf("Running %v \n", os.Args[2:])
    cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags:   syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
        Unshareflags: syscall.CLONE_NEWNS,
    }
    must(cmd.Run())
}

func child() {
    fmt.Printf("Running %v \n", os.Args[2:])
    cg()
    cmd := exec.Command(os.Args[2], os.Args[3:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    must(syscall.Sethostname([]byte("container")))
    must(syscall.Chroot("/home/liz/ubuntufs"))
    must(os.Chdir("/"))
    must(syscall.Mount("proc", "proc", "proc", 0, ""))
    must(syscall.Mount("thing", "mytemp", "tmpfs", 0, ""))
    must(cmd.Run())
    must(syscall.Unmount("proc", 0))
    must(syscall.Unmount("thing", 0))
}

func cg() {
    cgroups := "/sys/fs/cgroup/"
    pids := filepath.Join(cgroups, "pids")
    os.Mkdir(filepath.Join(pids, "liz"), 0755)
    must(ioutil.WriteFile(filepath.Join(pids, "liz/pids.max"), []byte("20"), 0700))
    // Removes the new cgroup in place after the container exits
    must(ioutil.WriteFile(filepath.Join(pids, "liz/notify_on_release"), []byte("1"), 0700))
    must(ioutil.WriteFile(filepath.Join(pids, "liz/cgroup.procs"), []byte(strconv.Itoa(os.Getpid())), 0700))
}

func must(err error) {
    if err != nil {
        panic(err)
    }
}
In particular, my understanding of a container is that it is a virtualized run-time environment in which users can isolate applications from the underlying system, and that containers are just isolated groups of processes running on a single host which fulfil a set of “common” features. I have a good sense of what a container is and what it is trying to accomplish in a broader sense, but I wanted help understanding a specific example like this. If someone understands this well: what is being imported in the import block; what are the cases for in the main function; what is the purpose of the statement in the run function; and what is being accomplished by the child and cg functions?
I think that, given my current understanding and having gone through the Docker tutorial, an explanation of a real from-scratch code example like this would be extremely beneficial. Just to confirm: this code is not related to Docker itself, other than that it creates a container, and Docker is a technology that makes creating containers easier.
She is creating a sort of container by doing this:
she will execute main.go and pass a command to be executed in the container
to do this she runs a process that executes the run() function
in the run() function she prepares a process to be forked that will execute the child() function
but before actually forking, via syscall.SysProcAttr, she configures a new namespace for:
"unix timesharing" (syscall.CLONE_NEWUTS) this essentially will allow to have a separate hostname in the child process
PIDs (syscall.CLONE_NEWPID) such that in the "container" she is creating she will have new PIDs starting from 1
mounts (syscall.CLONE_NEWNS) will enable the "container" to have separate mounts
next she executes the fork (cmd.Run())
in the forked process the child() function is executed, and here:
she prepares a control group via cg() that will limit the resources available to the "container", this is done by writing some proper files in the /sys/fs/cgroup/
next she prepares the command to be executed by using the args passed to main.go
she uses chroot to set a new root under /home/liz/ubuntufs
she mounts the special proc fs and another temporary fs
finally she executes the command provided as args to main.go
In the video, Containers From Scratch, she presents all of this very well.
There she executes a bash in the container that sees new PIDs, has a new hostname, and is limited to 20 processes.
To make it work she needed a full Ubuntu fs clone under /home/liz/ubuntufs.
The 3 key points to take home are that a container (well, her "container") essentially does this:
uses namespaces to define what the container will see in terms of PIDs/mounts (she did not handle networking in this container example)
uses chroot to restrict the container to a portion of the filesystem
uses cgroups to limit resources the container can use
Due to my lack of experience in Go and limited experience with custom Docker containers, I cannot confirm what this code does.
While this is not directly answering the question in the title, I want to provide an answer that helps you learn the basics of Docker to get you started.
Your understanding of containers is correct. Try to find a tutorial that uses a simpler example in a language you're familiar with.
One simple example to get you started would be to create a container of your preferred Linux OS, attach the container to your current terminal, and then run a few OS-specific commands within it (such as installing software inside the container, or any other Linux command).

Looking for a convenient way to start and stop applications with docker-compose

For each of my projects, I have configured a docker development environment consisting of several containers. I often switch between projects. That requires stopping one set of containers and starting another. I currently do it like this:
$ cd project1
$ docker-compose stop
$ cd ../project2
$ docker-compose up -d
So I need to remember which application is currently running, cd into the directory where its docker-compose.yml is, stop it, then remember what other project I want to run, cd there and start it.
Is there a better way? Like a utility that remembers which multicontainer applications I have, can stop the currently running one and run another one without manual cding and docker-composeing?
(By the way, what's the correct term for a set of containers hosting parts of a single application?)
I hope docker-compose-ui will help you in managing your applications.
I think the real problem here is this:
That requires stopping one set of containers and starting another.
You shouldn't need to stop one project to start another.
Instead of mapping to the same host ports I would not map any ports at all. Then use a script to lookup the IP of the container, and connect directly to that:
#!/bin/bash
cip=$(docker inspect -f '{{range $key, $value := .NetworkSettings.Networks}} {{ $value.IPAddress}} {{end}}' $1)
This will look up the container IP. Combine that with a command to open the URL:
url="http://$cip:8080/"
xdg-open $url || open $url
All together this will let you run the application without having to map any host ports. When host ports don't exist, you don't have to stop other projects.
If you know a bit of Ruby, you can use scaffolding for this.
A bare-bones example using threads (to start the different docker-compose sessions from one process and then stop them all together):
require 'docker-compose'

threads = []
project_paths = %w(/project/path1 /project/path2 /project/path3 /project/path)

project_paths.each do |path|
  # Start each project's docker-compose session in its own thread
  # (assumes Docker::Compose::Session#up maps to `docker-compose up`)
  threads.push Thread.new {
    Docker::Compose::Session.new(dir: path).up
  }
end

begin
  threads.each do |thread|
    thread.join
  end
rescue SystemExit, Interrupt
  threads.each do |thread|
    thread.kill
  end
rescue Exception => e
  handle_exception e
end
source
It uses
docker-compose gem
threads
Just set project_paths to the folders of your projects. If you want to end them all, use CTRL+C.
You can of course go beyond that, using a daemon and trying to start/stop some of them by giving them "names" and such, but I guess as a starting point for scaffolding, that should be enough.
