Dockerfile build error and writes to another folder - docker

I made a Dockerfile like this:
FROM hyeshik/tailseeker:latest
RUN rm /opt/tailseeker/conf/defaults.conf
COPY /Users/Downloads/defaults.conf /opt/tailseeker/conf/
COPY /Users/Downloads/level2/* /opt/tailseeker/refdb/level2/
COPY /Users/Downloads/level3/* /opt/tailseeker/refdb/level3/
My /Users/Downloads/ folder also has other folders, including one named input.
When I ran
docker build -f /Users/Downloads/Dockerfile /Users/Downloads/
I got an error saying:
Sending build context to Docker daemon 126.8 GB
Error response from daemon: Error processing tar file(exit status 1): write /input/Logs/Log.00.xml: no space left on device
One strange thing here is why it is trying to write to the input folder. The other is why it complains about no space left on device: I have a 1 TB disk and only 210 GB of it is used. I also resized my Docker.qcow2 using qemu-img. Here is the info for my Docker.qcow2:
image: /Users/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2
file format: qcow2
virtual size: 214G (229780750336 bytes)
disk size: 60G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: true
refcount bits: 16
corrupt: false
Can anyone please help me copy the contents of my /Users/Downloads folder into the Docker image using the Dockerfile above?
Thanks in advance.

A build starts by creating a tarball from the context directory (in your case /Users/Downloads/) and sending that tarball to the daemon. The tarball is created in the tmp directory, which is probably why you're running out of space when trying to build.
When you're working with large datasets the recommended approach is to use a volume. You can use a bind mount volume to mount the files from the host.
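A minimal sketch of that approach, reusing the paths from the question (the exact tailseeker invocation depends on the image, so it is omitted here):
docker run \
  -v /Users/Downloads/defaults.conf:/opt/tailseeker/conf/defaults.conf \
  -v /Users/Downloads/level2:/opt/tailseeker/refdb/level2 \
  -v /Users/Downloads/level3:/opt/tailseeker/refdb/level3 \
  hyeshik/tailseeker:latest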
If the files you're trying to add aren't that large, you might just need a .dockerignore to exclude the other files under /Users/Downloads.
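For example, a .dockerignore placed next to the Dockerfile in /Users/Downloads/ could exclude the large input folder named in the question (add a line for anything else the COPY instructions don't need):
input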
You can also start the Docker daemon with an alternative temp directory by setting the DOCKER_TMPDIR environment variable.
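For example, when starting the daemon manually (assuming /mnt/bigdisk is a location with enough free space):
DOCKER_TMPDIR=/mnt/bigdisk/docker-tmp dockerd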

Related

How can I load a Docker image created from the .tar file of an original Docker image?

I have a Docker image in .tar format. When I load it using sudo docker load < image.tar, it works fine.
I used tar -xf image.tar to un-archive the file. I then un-archived each layer file so I could edit some scripts and update some libraries manually. Once I was done with this, I used tar -cf on each layer and then the entire image.
When I load the modified image the same way I loaded the original, it does not work. I get this error:
open /var/lib/docker/tmp/docker-import-742628246/image-edited/json: no such file or directory
What could I have done wrong to cause this error and how can I properly load the modified .tar file into Docker?
P.S.: The problem appeared on Docker 20.10.12, running on Kali Linux 2021.4 inside VMware Workstation Player.
I was modifying an image tarball and trying to load it back into the Docker daemon when I saw the same error. I tried this: run sha256sum on the layer.tar in each layer, then search the config JSON file (the other JSON file at the same level as manifest.json, the one with a long SHA value as its filename) for the original hash and replace it with the new sha256 value; then this error disappears.
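A rough sketch of that fix-up (the layer directory, the old digest, and the config file name are placeholders for the ones in your extracted tarball):
# recompute the digest of the edited layer
NEW=$(sha256sum <layer-dir>/layer.tar | awk '{print $1}')
# replace the old digest recorded in the config JSON with the new one
sed -i "s/<old-sha256>/$NEW/" <config-sha>.json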
But then again, another error like /var/lib/docker/tmp-xxxx/xxxxx cannot open file appears, so I think nothing should (or can) be modified in the layer tar, as we cannot bypass Docker's image integrity checks in any easy way.
I am planning to add another layer to the image instead, using Google's Jib tool, copying a script into the container that performs the modification I want, so that the original layers stay intact.
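The same keep-the-original-layers idea can also be expressed with a plain Dockerfile rather than Jib (the image and script names below are placeholders):
FROM original-image:tag
COPY modify.sh /tmp/modify.sh
RUN /tmp/modify.sh && rm /tmp/modify.sh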

docker build running into GB size

I have a Cassandra.tar.gz file which I want to convert into an image. I created a Dockerfile (CassandraImageDockerFile.txt) with the following contents:
FROM scratch
ADD apache-cassandra-3.11.6-bin.tar /
Then I ran the following command but noticed that the image size was running into GB while the .tar is only 140 MB. I pressed Ctrl+C to stop the command.
C:\Users\manuc\Documents\manu>docker build -f CassandraImageDockerFile.txt .
Sending build context to Docker daemon 4.34GB
What happened under the hood? Why did the image size go in GB? What is the right way to build the image?
The last arg to the build command is the build context. All files that you add or copy to the image must be within that context. It gets sent to the docker engine and the build runs within a sandbox (temp folder and containers) using that context. In this case, the context path is . aka the current directory. So look in that folder and all child directories for files that will total many GB. You can exclude files from being sent in the context to the engine using the .dockerignore file, with a nearly identical syntax to the .gitignore file.
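One way to keep the context small is to build from a directory that contains only what the image needs, for example (following the Windows paths in the question):
mkdir cassandra-ctx
copy apache-cassandra-3.11.6-bin.tar cassandra-ctx\
copy CassandraImageDockerFile.txt cassandra-ctx\
docker build -f cassandra-ctx\CassandraImageDockerFile.txt cassandra-ctx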
There are a few things to check here:
The size of the base image, i.e., scratch (scratch is empty, so it contributes nothing).
The size of the build context, i.e., the directory from which you are building the image.
For example, docker image build -t xyz:1 .
Here, the build context is the content of the current folder.
So, while building the image, Docker sends the build context to the daemon, and anything the Dockerfile ADDs or COPYs from it ends up in the image, which might be the reason for the huge size.
So, check the contents of the directory and see whether you are adding any unnecessary files to your image.
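To see what is inflating the context before building, something like this helps on Linux/macOS:
du -sh .     # total size of the build context
du -sh ./*   # size of each child directory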
I think the image you are starting from is already some GB in size. Can you please check? Look at that scratch image in the FROM line of the Dockerfile.

Why does it show "File not found" when I am trying to run a command from a docker file to find and remove specific logs?

I have a Dockerfile which has the below command:
#Kafka log cleanup for log files older than 7 days
RUN find /opt/kafka/logs -name "*.log.*" -type f -mtime -7 -exec rm {} \;
While executing, it gives an error that /opt/kafka/logs is not found. But I can access that directory. Any help on this is appreciated. Thank you.
Changing the contents of a directory defined with VOLUME in your Dockerfile using a RUN step will not work. The temporary container will be started with an anonymous volume and only changes to the container filesystem are saved to the image layer, not changes to the volume.
The RUN step, along with every other step in the Dockerfile, is used to build the image, and this image is the input to the container; the build does not use your running containers or volumes as input, so it makes no sense to clean up files that were not created as part of your image build.
If you do delete files created in your image build, you should make sure this is done within the same RUN step. Otherwise, files you delete are already written to an image layer, and are transferred and stored on disk, just not visible in containers based on the layer that includes the delete step.
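A sketch of that same-step pattern (the log-producing script is a placeholder; note also that find -mtime +7 matches files older than seven days, whereas -mtime -7 matches files modified within the last seven days):
# create and clean up in one RUN step so the deleted files never land in a layer
RUN /opt/kafka/bin/generate-logs.sh \
 && find /opt/kafka/logs -name "*.log.*" -type f -mtime +7 -exec rm {} \;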

How can I edit an existing docker image metadata?

I would like to edit a docker images metadata for the following reasons:
I don't like an image parent's EXPOSE, VOLUME, etc. declarations (see #3465; the Docker team did not want to provide a solution), so I'd like to "un-volume" or "un-expose" the image.
I don't like an image's ContainerConfig (see docker inspect [image]) because it was generated from a running container using docker commit [container].
Fix errors during docker build or docker run like:
cannot mount volume over existing file, file exists [path]
Is there any way I can do that?
It's a bit hacky, but it works:
Save the image to a tar.gz file:
$ docker save [image] > [targetfile.tar.gz]
Extract the tar file to get access to the raw image data:
tar -xvzf [targetfile.tar.gz]
Look up the image metadata file via manifest.json: there should be a key like .Config which contains a [HEX] value, and a matching [HEX].json in the root of the extracted folder.
This is the file containing the image metadata. Edit as you like.
Pack the extracted files back into a new .tar.gz archive
Use cat [new.tar.gz] | docker load to re-import the modified image
Use docker inspect [image] to verify your metadata changes have been applied
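Putting those steps together as a shell sketch (the image name is a placeholder; note that docker save actually emits an uncompressed tar, so plain tar -xf is sufficient):
docker save myimage:latest > image.tar
mkdir unpacked && tar -xf image.tar -C unpacked
# edit unpacked/[HEX].json (the file named by .Config in manifest.json)
tar -C unpacked -cf image-edited.tar .
cat image-edited.tar | docker load
docker inspect myimage:latest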
EDIT:
This has been wrapped into a handy script: https://github.com/gdraheim/docker-copyedit
I had come across the same workaround. Since I have to edit the metadata of some images quite often (fixing an automated image rebuild from a third party), I created a little script to help with the save/unpack/edit/load steps.
Have a look at docker-copyedit. It can remove or override volumes, as well as set other metadata values like the entrypoint and cmd.

What is the difference between save and export in Docker?

I have been playing around with Docker for a couple of days and I have already made some images (which was really fun!). Now I want to persist my work and came across the save and export commands, but I don't fully understand them.
What is the difference between save and export in Docker?
The short answer is:
save will fetch an image: for a VM or a physical server, that would be the installation .ISO image or disk, i.e., the base operating system.
It will pack the layers and metadata of the whole chain required to build the image. You can then load this "saved" image chain into another Docker instance and create containers from these images.
export will fetch the whole container: like a snapshot of a regular VM. It saves the OS, of course, but also any change you made and any data file written during the container's life. This one is more like a traditional backup.
It will give you a flat .tar archive containing the filesystem of your container.
Edit: as my explanation may still lead to confusion, I think that it is important to understand that one of these commands works with containers, while the other works with images.
An image has to be considered 'dead' or immutable; starting 0 or 1000 containers from it won't alter a single byte. That's why I made a comparison with a system install ISO earlier. It's maybe even closer to a live CD.
A container "boots" the image and adds an additional layer on top of it. This layer stores any change on the container (created/changed/removed files...).
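A quick way to see both commands side by side (assuming alpine:latest is available locally):
docker save alpine:latest -o alpine-image.tar   # all layers + metadata
docker create --name demo alpine:latest
docker export demo -o alpine-rootfs.tar         # flat filesystem of the container
docker load -i alpine-image.tar                 # restores the image unchanged
docker import alpine-rootfs.tar alpine-flat     # new single-layer image, no metadata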
There are two main differences between save and export commands.
The save command saves the whole image, with its history and metadata, but the export command exports only the file structure (without history or metadata). So the exported tar file will be smaller than the saved one.
When you use an exported filesystem to create a new image, that new image will not contain any USER, EXPOSE, RUN, etc. settings from your Dockerfile. Only the file structure is transferred.
So if you use the mentioned keywords in your Dockerfile, you cannot use the export command to transfer the image to another machine; you always need to use the save command.
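This is easy to verify (assuming a container named web was created from nginx:latest):
docker export web | docker import - nginx-flat
docker inspect -f '{{.Config.ExposedPorts}}' nginx:latest   # map[80/tcp:{}]
docker inspect -f '{{.Config.ExposedPorts}}' nginx-flat     # empty: the EXPOSE metadata is gone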
export: container (filesystem) -> image tar.
import: exported image tar -> image. Only one layer.
save: image -> image tar.
load: saved image tar -> image. All layers will be recovered.
From Docker in Action, Second Edition p190.
Layered images maintain the history of the image, container-creation metadata, and old files that might have been deleted or overridden.
Flattened images contain only the current set of files on the filesystem.
The exported image will not have any layer or history information saved, so it will be smaller and you will not be able to roll back.
The saved image will have layer and history information, so it will be larger.
If you are giving this to a customer, the question is: do you want to keep those layers or not?
Technically, save/load works with repositories, which can contain one or more images (also referred to as layers). An image is a single layer within a repo. Finally, a container is an instantiated image (running or not).
docker save produces a tar file of a repo which contains all parent layers and all tags + versions (or the specified repo:tag) for each image argument provided.
docker export produces the specified file (tar or tgz) with the flat contents of the container's filesystem, excluding the contents of any specified volumes.
docker save is used on a Docker image, while docker export is used on a container (i.e., an instantiated image).
save usage:
docker save [OPTIONS] IMAGE [IMAGE...]
Save an image(s) to a tar archive (streamed to STDOUT by default)
  --help=false      Print usage
  -o, --output=""   Write to a file, instead of STDOUT
export usage:
docker export [OPTIONS] CONTAINER
Export the contents of a container's filesystem as a tar archive
  --help=false      Print usage
  -o, --output=""   Write to a file, instead of STDOUT
