My goal:
I have a pre-built Docker image and want to run all my Flows on that image.
Currently:
I have the following task which is running on a Local Dask Executor.
The server on which the agent is running uses a different Python environment from the one needed to execute my_task, hence the need to run inside a pre-built image.
My question is: How do I run this Flow on a Dask Executor such that it runs on the docker image I provide (as environment)?
import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")

with Flow("My Flow") as flow:
    results = hello_task()

flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)
I thought that I needed to start the server and the agent on that Docker image first (as discussed here), but I guess there must be a way to simply run the Flow on a provided image.
Update 1
Following this tutorial, I tried the following:
import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
from prefect.environments.storage import Docker
@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")

with Flow("My Flow") as flow:
    results = hello_task()

flow.storage = Docker(registry_url='registry.gitlab.com/my-repo/image-library')
flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)
flow.register(project_name="testing")
But this built a new image, which it then uploaded to the registry_url provided. Afterwards, when I tried to run the registered flow, it pulled the newly created image, and the run has been stuck in the Submitted for execution state for minutes now.
I don't understand why it pushed a new image and then pulled it. I already have an image built in this registry; I'd like to specify an existing image that should be used for task execution.
The way I ended up achieving this is as follows:
Run prefect server start on the server (i.e. not inside Docker); apparently running docker-compose inside Docker is not a good idea.
Run prefect agent start inside the docker image.
Make sure the flows are accessible by the docker image (for example by mounting a shared volume between the container and the server).
You can see the source of my answer here.
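For reference, a minimal hedged sketch of how an existing registry image could be reused, assuming this Prefect version's Docker storage accepts a base_image argument (the image name below is a placeholder). Note that Docker storage still builds and pushes a new image layered on top of the base, because the serialized flow has to be baked into it, which is also why a push followed by a pull is the expected behaviour:

import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
from prefect.environments.storage import Docker

@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")

with Flow("My Flow") as flow:
    hello_task()

# base_image is a placeholder: the pre-built image already available in the registry.
flow.storage = Docker(
    registry_url="registry.gitlab.com/my-repo/image-library",
    base_image="registry.gitlab.com/my-repo/image-library/my-prebuilt-env:latest",
)
flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)
flow.register(project_name="testing")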
Related
I'm new to Docker, and I'm not quite sure how to deal with this situation.
So I'm trying to run a docker container in order to replicate some results from a research paper, specifically from here: https://github.com/danhper/bigcode-tools/blob/master/doc/tutorial.md
(image link: https://hub.docker.com/r/tuvistavie/bigcode-tools/).
I'm using a Windows machine, and every time I try to run the docker image (via docker run -p 80:80 tuvistavie/bigcode-tools), it instantly closes. I've tried running other images, such as getting-started, but that image doesn't close instantly.
I've looked at some other potential workarounds, like using -dit, but since the instructions require setting an alias/doskey for a docker run command, using the alias and chaining it with other commands multiple times results in creating a queue for the docker container since the port is tied to the alias.
As in the instructions from the GitHub link, I'm trying to set an alias/doskey to make API calls to pull data, but I am not getting any data, nor any errors, when performing the calls in the command prompt.
Sorry for the long question, and thank you for your time!
Going in order of the instructions:
0. I can run this, it added the image to my Docker Desktop
1.
Since I'm using a Windows machine, I had to use 'set' instead of 'export'.
I'm not exactly sure what the $ is meant for in UNIX, and whether or not it has significant meaning, but from my understanding, the whole purpose is to create a directory named 'bigcode-workspace'
Instead of 'alias', I needed to use doskey.
Since -dit prevented my image from instantly closing, I added that in as well, but I'm not 100% sure what it means. Running docker run (...) resulted in the docker image instantly closing.
When it came to using the doskey alias + another command, I've tried:
(doskey macro) (another command)
(doskey macro) ^& (another command)
(doskey macro) $T (another command)
This also seemed to be using a GitHub API call, so I also added --token=(github_token), but that didn't change anything either.
Because the later steps require expected data pulled from here, I am unable to progress any further.
It looks like this image is designed to be used as a command-line utility, so it is not meant to run continuously; instead, you invoke it via the docker-bigcode alias for each task.
$BIGCODE_WORKSPACE is an environment variable expansion here, so on a Windows machine it's %BIGCODE_WORKSPACE%. You might want to set this variable in Settings -> System -> About -> Advanced System Settings, because variables set with the SET command apply to the current command prompt session only. Or you can specify the path directly, without the environment variable.
As for the alias, I would just create a batch file with the following content:
docker run -p 6006:6006 -v %BIGCODE_WORKSPACE%:/bigcode-tools/workspace tuvistavie/bigcode-tools %*
This will run the specified command, appending the batch file's parameters at the end. You might need to add double quotes if the BIGCODE_WORKSPACE path contains spaces.
I am trying to run the containernet_example.py file (where I replaced the two Docker image hosts with my own Docker images), with ONOS as the controller for my topology.
When I access the ONOS UI via localhost:8181/onos/ui/login.html, I am not able to see the hosts (i.e. the Docker containers): the topology is not displayed in the ONOS page, even though pinging between the hosts works in the Containernet CLI. The command I use is:
sudo mn --controller remote,ip=MYIPADDRESS --switch=ovsk,protocols=OpenFlow13 --custom containernet_example.py
Whereas if I try standard topologies like tree, those topologies do show up. I want to use those Docker images as hosts both in the ONOS GUI and in the Containernet CLI.
I have been reading so many posts but I could not solve this issue. Any insight would be helpful. Thanks in advance.
The code used here was adapted from another Stack Overflow answer, which I could not link to because I did not bookmark the exact page.
Below is the code that worked for me in the case of 2 docker images as containernet hosts.
from mininet.net import Containernet
from mininet.node import Controller, OVSKernelSwitch, RemoteController
from mininet.cli import CLI
from mininet.link import TCLink
from mininet.log import info, setLogLevel
setLogLevel('info')
net = Containernet(controller=RemoteController, switch=OVSKernelSwitch) #remote controller
info('*** Adding controller\n')
net.addController('c0', controller=RemoteController, ip='MYIPADDRESS', port=6653)
info('*** Adding docker containers\n')
d1 = net.addDocker('d1', ip='10.0.0.251', dimage="myimage1:myimagetag")
d2 = net.addDocker('d2', ip='10.0.0.252', dimage="myimage2:myimagetag")
info('*** Adding switches\n')
s1 = net.addSwitch('s1', protocols="OpenFlow13")  # mentioning the OpenFlow protocol version
info('*** Creating links\n')
net.addLink(d1, s1)
net.addLink(d2, s1)
info('*** Starting network\n')
net.start()
info('*** Testing connectivity\n')
net.ping([d1, d2,])
info('*** Running CLI\n')
CLI(net)
info('*** Stopping network')
net.stop()
And the command I used is simply: sudo python3 myfilename.py
I am trying to push a Docker image to the Google Cloud Platform Container Registry in order to define a custom training job directly inside a notebook.
After having prepared the correct Dockerfile and the URI where to push the image that contains my train.py script, I try to push the image directly in a notebook cell.
The exact command I try to execute is !docker build ./ -t $IMAGE_URI, where IMAGE_URI is the environment variable previously defined. However, whenever I try to run this command I get the error /bin/bash: docker: command not found. I also tried to execute it with the %%bash cell magic, by importing the subprocess library, and by executing the command stored in a .sh file.
Unfortunately, none of the above solutions work; they all return the same 'command not found' error with code 127.
If instead I run the command from a bash terminal in JupyterLab, it works as expected.
Is there any workaround to make the push execute inside the Jupyter notebook? I was trying to keep the whole custom training process inside the same notebook.
If you follow this guide to create a user-managed notebook from Vertex AI Workbench and select Python 3, it comes with Docker available.
So you will be able to use Docker commands such as ! docker build . inside the user-managed notebook.
Example:
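For instance, a hedged sketch of what a build-and-push cell could look like in such a user-managed notebook (the image URI below is a made-up placeholder; IPython expands the {IMAGE_URI} Python variable inside the ! shell escape):

# Placeholder URI -- replace with your own project and repository.
IMAGE_URI = "gcr.io/my-project/my-training-image:latest"

# Build the image from the notebook's working directory, then push it.
! docker build ./ -t {IMAGE_URI}
! docker push {IMAGE_URI}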
My actual workloads that should be run as tasks within a Prefect flow are all packaged as docker images. So a flow is basically just "run this container, then run that container".
However, I'm unable to find any examples of how I can easily start a Docker container as a task. Basically, I just need docker run from a flow.
I'm aware of https://docs.prefect.io/api/latest/tasks/docker.html and tried various combinations of CreateContainer and StartContainer, but without any luck.
Using the Docker tasks from Prefect's Task Library could look something like this for your use case:
from prefect import task, Flow
from prefect.tasks.docker import (
    CreateContainer,
    StartContainer,
    GetContainerLogs,
    WaitOnContainer,
)

create = CreateContainer(image_name="prefecthq/prefect", command="echo 12345")
start = StartContainer()
wait = WaitOnContainer()
logs = GetContainerLogs()

@task
def see_output(out):
    print(out)

with Flow("docker-flow") as flow:
    container_id = create()
    s = start(container_id=container_id)
    w = wait(container_id=container_id)
    l = logs(container_id=container_id)
    l.set_upstream(w)
    see_output(l)

flow.run()
The snippet above will create a container, start it, wait for completion, retrieve the logs, and then print the output of echo 12345 to the command line.
Alternatively, you could also use the Docker Python client directly in your own tasks: https://docker-py.readthedocs.io/en/stable/api.html#module-docker.api.container
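For example, here is a minimal hedged sketch of that alternative, wrapping docker-py's high-level client in a custom task (the image and command are just placeholders):

import docker
from prefect import task, Flow

@task
def run_container(image, command):
    # Connects to the local Docker daemon; with the default detach=False,
    # containers.run() returns the container's output once it exits.
    client = docker.from_env()
    output = client.containers.run(image, command, remove=True)
    return output.decode()

@task
def see_output(out):
    print(out)

with Flow("docker-py-flow") as flow:
    out = run_container("prefecthq/prefect", "echo 12345")
    see_output(out)

flow.run()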
I was just going through this tutorial HERE about Docker images, and to be more specific, about extending a Docker image. If you scroll to the section that says Building an image from a Dockerfile, you'll see that a new Dockerfile is being built. Now, is this an independent image, or is this Dockerfile extending the training/sinatra image? That would be my question.
So, to repeat my question: is the Dockerfile in the Building an image from a Dockerfile section creating a new image or extending the training/sinatra image?
Thank you.
The command in that section is
docker build -t ouruser/sinatra:v2 .
That means it is creating a new image, extending the one mentioned in the Dockerfile: FROM ubuntu:14.04
The end result is:
a new image belonging to the user ouruser, the repository name sinatra and given it the tag v2.
each step creates a new container, runs the instruction inside that container and then commits that change - just like the docker commit work flow we saw earlier.
When all the instructions have executed we’re left with the 97feabe5d2ed image (also helpfully tagged as ouruser/sinatra:v2) and all intermediate containers will get removed to clean things up.
So again, this is an independent image, independent from training/sinatra.
To extend an image, you:
either make a Dockerfile which starts with FROM <animage> and build it: this will execute a series of docker commits on the intermediate containers,
or, and that is what is described in "Updating and committing an image", you do it manually, by running a bash session, executing a command, exiting, and committing the exited container into a new image.
The first approach scales better: you chain multiple commits specified in one Dockerfile.
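As an illustration, a minimal hedged sketch of a Dockerfile that would genuinely extend training/sinatra rather than ubuntu:14.04 (the extra gem is just a placeholder step):

# Start from the existing training/sinatra image instead of a base OS image.
FROM training/sinatra
# Hypothetical extra layer added on top of the parent image.
RUN gem install json

Building it with docker build -t ouruser/sinatra:extended . would then produce a new image whose parent is training/sinatra.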