container_image full docker image tarball - docker

As mentioned here, the full docker image tarball is created as an implicit output on demand.
But what does "on demand" mean? How can I tell the build to produce it?
The documentation seems to be lacking here.

You can specify the "image.tar" target name explicitly to build it; for instance, if your container_image target is named my_image in //foo, this will produce the tar archive in bazel-out:
bazel build //foo:my_image.tar

Usually a Docker image is stored within the Docker daemon’s private filesystem space, in a complex format. In ordinary use you’d use docker build, pull, run, and rmi to interact with an image, but none of these result in a file the way building an executable would.
There is an alternate path to managing images using docker save to export an image to a tar file, and docker load to reload it. This is awkward and has a number of disadvantages, and you almost always want to set up a registry or use a hosted registry like Docker Hub given the option.
My read of the documentation you link to is that, by default, the Bazel rule will just run docker build; but if something downstream of it demands an image.tar file, it will run docker save too. That tarball isn't usually needed, and Bazel won't produce it unless something asks for it.
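For concreteness, a minimal sketch of the two steps, assuming my_image is a rules_docker container_image target in //foo (the exact output path under bazel-bin may vary; check the bazel build output):
bazel build //foo:my_image.tar
# the implicit output lands somewhere like bazel-bin/foo/my_image.tar
docker load -i bazel-bin/foo/my_image.tar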

Related

Dockerfile save image to ssh host

I'm trying to deploy a Docker image containing an ASP.NET Core (.NET 6) WebApi to a remote server over SSH.
I know the command for transferring the image file is:
docker save <my_image_name> | ssh -C user@address docker load
Is it possible to execute this command within the Dockerfile right after building the image?
A Dockerfile can never run commands on the host, push its own image, save its output to a file, or anything else. The only thing it's possible to do in a Dockerfile is specify the commands needed to build the image within its isolated environment.
So, for example, in your Dockerfile you can't specify the image name that will be used (or its tag), or forcibly push something to a registry, or the complex docker save | ssh sequence you show. There's no option in the Dockerfile to do it.
You must run this as two separate commands, using pretty much the syntax you show. If the two systems are on a shared network, a better approach would be to set up a registry server of some sort and docker push the image there; the docker save ... docker load sequence isn't usually a preferred option unless the two systems are on physically isolated networks. Whatever you need to do after you build the image, you could also consider asking your continuous-integration system to do it for you, to avoid the manual step.
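As a hedged illustration of that two-step flow (the image name, registry host, and user@address are placeholders):
docker build -t my-webapi .
docker save my-webapi | ssh -C user@address docker load
# or, if a shared registry is available:
docker tag my-webapi registry.example.com/my-webapi:latest
docker push registry.example.com/my-webapi:latest
ssh user@address docker pull registry.example.com/my-webapi:latest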

How to include files outside of build context without specifying a different Dockerfile path?

This is basically a follow-up question to How to include files outside of Docker's build context?: I'm using large files in all of my projects (several GBs) which I keep on an external drive, only used for development.
I want to COPY or ADD these files to my docker container when building it. The answer linked above allows one to specify a different path to a Dockerfile, potentially extending the build context. I find this impractical, since it would require setting the build context to the system root (?) just to include a single file.
Long story short: Is there any way or workaround to include a file that is far removed from the docker build context?
Three suggestions on things you could try:
include a file that is far removed from the docker build context?
You could construct your own build context by cp (or tar) files on the host into a dedicated directory tree. You don't have to use the actual source tree or your build tree.
rm -rf docker-build
mkdir docker-build
cp -a Dockerfile build/the-binary docker-build
cp -a /mnt/external/support docker-build
docker build ./docker-build
# reads docker-build/Dockerfile, and the files in the
# docker-build directory, but nothing else; only sends
# the docker-build directory to Docker as the build context
large files [...] (several GBs)
Docker doesn't deal well with build contexts this large. In the past I've at least seen docker build take a long time just on the step of sending the build context to the daemon, and docker push and docker pull have network issues when trying to send the gigabyte-plus layer around.
It's a little hacky and breaks the "self-contained image" model a little bit, but you can provide these files as a Docker bind-mount instead of including them in the image. Your application needs to know what to do if the data isn't there. When you go to deploy the application, you also need to separately distribute the files alongside the Docker image and other deployment artifacts.
docker run \
  -v /mnt/external/support:/app/support \
  ...
  the-image-without-the-support-files
only used for development
Potentially you can get away with not using Docker at all during this phase of development. Use a local source tree and local development tools; run your unit tests against these large test fixtures as needed. Build a Docker image only when you're about to run pre-commit integration tests; that may be late enough in the development cycle that you don't need these files.
I think the main thing you are worried about is that you do not want to send all the files in a directory to the Docker daemon while it builds the image.
When the directory is that big (several GBs), it takes a lot of time to build an image.
If the requirement is just to use those files while you build something inside Docker, you can mount them into a container instead.
A tricky way
Run a container from the base image and mount the directories inside it: docker run -d -v local-path:container-path
Get inside the container: docker exec -it CONTAINER_ID bash
Run the build step: ./build-something.sh
Create an image from the running container: docker commit CONTAINER_ID
Tag the image: docker tag IMAGE_ID tag:v1 (you can get the image ID from the previous command)
From a long-term perspective this method may seem very tedious, but if you only need to build the image once or twice, it is worth trying.
I tried this for one of my Docker images, as I wanted to avoid sending a large number of files to the Docker daemon during the image build.
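Putting those steps together, a rough end-to-end sketch (the image names and mount paths are illustrative, not from the question):
docker run -d --name build-env -v /mnt/external/support:/data some-base-image sleep infinity
docker exec -it build-env bash      # run ./build-something.sh inside the container, then exit
docker commit build-env             # prints the new IMAGE_ID
docker tag IMAGE_ID my-image:v1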
The COPY command takes source and destination values; just specify the full absolute path to your hard drive's mount point as the source directory:
COPY /absolute_path/to/harddrive /container/path

How to run a tensorflow/tfx container?

I am new to docker, and have downloaded the tfx image using
docker pull tensorflow/tfx
However, I am unable to find anywhere how to successfully launch a container from it.
here's a naive attempt
You can use docker image ls to get a list of locally-built Docker images. Note that an "image" is a template for a VM.
To instantiate the VM and shell into it, you would use a command like docker run -it --entrypoint bash tensorflow/tfx. This spins up a temporary VM based on the tensorflow/tfx image.
By default, Docker assumes you want the latest version of that image stored on your local machine, i.e. tensorflow/tfx:latest in the list. If you want to change that, you can reference a specific image version by name or hash, e.g. docker run -it --entrypoint bash tensorflow/tfx:1.0.0 or docker run -it --entrypoint bash fe507176d0e6. I typically use the docker image ls command first and cut & paste the hash, so my notes can be specific about which build I'm referencing even if I later edit the relevant Dockerfile.
Also note that changes you make inside that VM will not be saved. Once you exit the bash shell, it goes away. The shell is useful for checking the state and file structure of a constructed image. If you want to change the image itself, use a Dockerfile. Each line of a Dockerfile creates a new intermediate image when the Dockerfile is built. If you know that something went wrong between lines 5 and 10 of the Dockerfile, you can potentially shell into each of those intermediate images in turn (with the docker run command I gave above, as sketched below) to see what went wrong. Kinda tedious, but it works.
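A hedged sketch of that debugging loop, assuming the classic (non-BuildKit) builder, which prints an intermediate image ID after each step (DOCKER_BUILDKIT=0 forces the classic builder on newer Docker versions):
DOCKER_BUILDKIT=0 docker build .                  # each step ends with a line like "---> fe507176d0e6"
docker run -it --entrypoint bash fe507176d0e6     # shell into the intermediate image for that step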
Also note that docker run is not equivalent to running a TFX pipeline. For the latter, you want to look into the TFX CLI commands or otherwise compile the pipeline - and probably upload it to an external Kubeflow server.
Also note that the Docker image is just a starting point for one piece of your TFX pipeline. A full pipeline will require you to specify the components you want, a more-complete Dockerfile, and more. That's a huge topic, and IMO, the existing documentation leaves a lot to be desired. The Dockerfile you create describes the image which will be distributed to each of the workers which process the full pipeline. It's the place to specify dependencies, necessary files, and other custom setup for the machine. Most ML-relevant concerns are handled in other files.
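As a hedged illustration of that last point, a custom worker image might extend the base image roughly like this (the file names are assumptions, not anything TFX mandates):
FROM tensorflow/tfx:latest
# extra Python dependencies used by your components
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
# custom modules referenced by the pipeline
COPY my_pipeline/ /pipeline/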

Is it possible to run a Dockerfile without building an image (or to immediately/transparently discard the image)?

I have a use case where I call docker build . on one of our build machines.
During the build, a volume mount from the host machine is used to persist intermediate artifacts.
I don't care about this image at all. I have been tagging it with -t during the build and calling docker rmi after it's been created, but I was wondering if there was a one-liner/flag that could do this.
The docker build steps don't seem to have an appropriate flag for this behavior, but it may be simply because build is the wrong term.
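For reference, the workaround described above amounts to something like this (the tag is arbitrary):
docker build -t throwaway-image .
docker rmi throwaway-image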

Is it possible to specify a custom Dockerfile for docker run?

I have searched high and low for an answer to this question. Perhaps it's not possible!
I have some Dockerfiles in directories such as dev/Dockerfile and live/Dockerfile. I cannot find a way to provide these custom Dockerfiles to docker run. docker build has the -f option, but I cannot find a corresponding option for docker run in the root of the app. At the moment I think I will write my npm/gulp script to simply change into those directories, but this is clearly not ideal.
Any ideas?
You can't - that's not how Docker works:
The Dockerfile is used with docker build to build an image
The resulting image is used with docker run to run a container
If you need to make changes at run-time, then you need to modify your base image so that it can take, for example, a command-line option passed to docker run or a configuration file provided as a mount.
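A hedged example of that two-step flow using the per-environment Dockerfiles from the question (the tags are illustrative):
docker build -f dev/Dockerfile -t myapp:dev .
docker run --rm myapp:dev
docker build -f live/Dockerfile -t myapp:live .
docker run --rm myapp:live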
