Backup + Version Docker Containers by adding Volumes and Commit - docker

We are about to "dockerize" our not-so-big infrastructure. One crucial question is the whole backup / restore workflow, which I think matters for most enterprises and even for private users.
I know about docker's export and save features, which generate a tarball of a container's filesystem or of an image respectively, which is neat because it can be done without shutting the container down.
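For reference, a rough sketch of both (the container and image names are placeholders for our setup):

# export the filesystem of a running container (mounted volumes are NOT included)
docker export app-x-container > app-x-container.tar
# save an image with all its layers and tags
docker save repo/app-x:latest > app-x-image.tar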
So let's say we are running a container X and we have mounted some volumes:
-v /home/user/dockerapp-X/data:/var/www/html
-v /home/user/dockerapp-X/logs:/var/logs/app-x
-v /home/user/dockerapp-X/config:/etc/app-x
The biggest benefit of this is that if we update app-X, we just have to pull the new image and restart the container.
But:
This way those directories wouldn't get backed up if we do docker export or docker save.
So we could just back up those directories separately, with rsync, Bacula or whatever. I guess this would be the "standard" way of backing up. But then there is no guarantee and also no connection between the current version of the image and the data.
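A minimal sketch of that separate backup with rsync (the backup target is hypothetical):

# copy the mounted volume directories to a backup host
rsync -a /home/user/dockerapp-X/ backup-host:/backups/dockerapp-X/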
On a VM we would just make a snapshot to have the data and the app connected.
So the question is:
Is it a best practice to just make a Dockerfile with the current app-x version, copy the volumes into the image, and build/push the whole image to our private repo?
so it would look like this:
FROM repo/app-x
# build context assumed to be /home/user/dockerapp-X
COPY data /var/www/html
COPY logs /var/logs/app-x
COPY config /etc/app-x
then
docker build -t repo/infra/app-x:backup-v1-22.10.2016 .
docker push repo/infra/app-x:backup-v1-22.10.2016
This would mean that in our repo there is a snapshot for the current version of the app and the image contains all current data of the volumes.
So restoring would be:
docker run --name=backup-restored repo/infra/app-x:backup-v1-22.10.2016
And we could even mount the data folders locally on the host again:
docker run --name=backup-restored \
-v /home/user/dockerapp-X/data:/var/www/html \
-v /home/user/dockerapp-X/logs:/var/logs/app-x \
-v /home/user/dockerapp-X/config:/etc/app-x \
repo/infra/app-x:backup-v1-22.10.2016
Would the restored container then have the correct data and the correct app version?

Related

How to recover docker images and containers from a backup of /var/lib/docker

I had a corrupted OS on Ubuntu 16, and I wanted to back up all the docker things. Starting the docker daemon outside fakeroot with --data-dir= didn't help, so I made a full backup of /var/lib/docker (with tar --xattrs --xattrs-include='*' --acls).
On the fresh system (upgraded to Ubuntu 22.04) I extracted the tar, but found that docker ps gave empty output. I still have the whole overlay2 filesystem and /var/lib/docker/image/overlay2/repositories.json, so there may be a way to extract the images and containers, but I couldn't find one.
Is there any way to restore them?
The backup actually worked; the problem was that the Docker installed during the Ubuntu Server 22.04 installation process was the snap version. After removing the snap package and installing a systemd-managed version, Docker recognized all the images and containers in the overlayfs. Thanks everyone!
For those who cannot start the docker daemon to make a backup, you can try cp -a or tar --xattrs-include='*' --acls --selinux to copy the whole /var/lib/docker directory.
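A rough sketch of that approach, assuming the daemon is managed by systemd (stop it first so the backup is consistent):

# on the old host
sudo systemctl stop docker
sudo tar --xattrs --xattrs-include='*' --acls --selinux -cpf docker-backup.tar -C /var/lib docker
# on the new host
sudo systemctl stop docker
sudo tar --xattrs --xattrs-include='*' --acls --selinux -xpf docker-backup.tar -C /var/lib
sudo systemctl start docker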
Probably not. As far as I have learned about Docker, it stores your images as layers identified by sha256 digests.
Even when you want to transfer images from one machine to another, you either need an online public/private registry to push and pull them, or you have to save them to a single archive file from the command line and copy that file to the other location.
Maybe next time make sure you push all your important images to an online registry.
You can also refer to the different answers in this thread: How to copy Docker images from one host to another without using a repository
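For completeness, a sketch of moving an image without a registry (the image name is just an example):

# on the source machine
docker save repo/app-x:latest | gzip > app-x.tar.gz
# copy the file over (scp, USB stick, ...), then on the target machine
gunzip -c app-x.tar.gz | docker load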

How to store all container's data in docker?

I am trying to execute ubuntu in docker. I use this command docker run -it ubuntu, and I want to install some packages and store some files. I know about volumes, but I have only used them in docker-compose. Is it possible to store all the container's data, and how can I do that properly?
When you run a container, Docker creates a namespace and loads the image filesystem in that namespace. Any changes you apply in a running container, including installing some packages, only remain for the lifetime of the container; if you remove the container and rerun it, they're gone.
If you want your changes to be permanent, you have to commit the running container and actually create an image from it, using the command shown below.
As David pointed out in the comments:
You should pretty much never run docker commit. It leads to images that can't be reproduced, and you'll be in trouble if there's a security fix you're required to take a year down the road.
sudo docker commit [CONTAINER_ID] [new_image_name]
If you have an app inside the container, like MySQL, and want the data stored by that app to be permanent, you should map a volume from the host like this:
docker run -d -v /home/username/mysql-data:/var/lib/mysql --name mysql mysql
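Following David's advice, the reproducible alternative to docker commit is a small Dockerfile; a minimal sketch (the packages and paths are just examples):

FROM ubuntu:22.04
# install the packages you would otherwise add by hand inside the container
RUN apt-get update && apt-get install -y vim curl && rm -rf /var/lib/apt/lists/*

Then build and run it, still mounting a volume for the data you want to keep:

docker build -t my-ubuntu .
docker run -it -v /home/username/ubuntu-data:/data my-ubuntu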

How docker detects which changes should be saved and which not?

I know that when we stop a docker container our changes are lost. There are many answers on how to prevent this - commit each time. The idea is that when docker runs, it spins up a fresh container based on the image. On the other hand, a container persists some data after it exits, unless you start it with --rm.
Just to simplify:
If you run apt-get install vim, you must commit to save the change
BUT if you change nginx.conf or upload a new file to HDFS, you do not lose the data.
So, just curious:
How does docker know what to save and what not? Example: at the end of apt-get install we have new files in the system. The same happens when I upload a new file. For the container/image there is NO difference, right? Just I/O modification. So how does docker know which modifications should be saved when we stop the container?
The basic rules here:
Anything you explicitly store outside the container — a database, S3 — will outlive the container.
If you attach a volume to the container when you create the container using a docker run -v option or a Docker Compose volumes: option, any data written to that directory outlives the container. (If it’s a named volume, it lasts until you docker volume rm it.)
Anything else in the container filesystem is lost as soon as you docker rm the container.
If you need things like your application source code or a helper tool installed in an image, write a Dockerfile to describe how to build the image and run docker build. Check the Dockerfile into source control alongside your application.
The general theory of working with Docker is that you always start from a clean slate. When you docker build an image, you start from a base image and install your application into it; you never try to upgrade an installed application. Similarly, when you docker run a container, you start from a fresh copy of its image.
So the clearest answer to the question you ask is really, if you consistently docker rm a container when you stop it, when you docker run a new container, it will have the base image plus the content from the mounted volumes. Docker will never automatically persist anything outside of this.
You should never run docker commit: this leads to magic images that can’t be recreated later (in six months when you discover a critical security issue that risks taking your site down). Similarly, you should never install software in a running container, because it will be lost as soon as the container exits; add it to your Dockerfile and rebuild.
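A quick way to see these rules in action (the host paths and container names are hypothetical, using the stock nginx and redis images):

# data written to a bind-mounted host directory outlives the container
docker run -d --name web -v /srv/site-data:/usr/share/nginx/html nginx
docker rm -f web                                                        # the container filesystem is gone
docker run -d --name web2 -v /srv/site-data:/usr/share/nginx/html nginx # the same data is back
# a named volume persists until you delete it yourself
docker run -d --name cache -v redisdata:/data redis
docker rm -f cache
docker volume rm redisdata                                              # only now is the data gone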
For any container on the Docker platform, all generated data and files are temporary by default: nothing persists unless you have mounted part of the filesystem, i.e. attached volumes to the container.
If you are finding that nginx.conf is being reused even after changes, I would suggest checking which directories you are mounting or mapping to docker volumes.
The nginx configuration resides at /etc/nginx/conf.d/*, and you might be mapping a volume to that directory. In that case, any changes you make in a running container are written to the mounted volume on the host rather than to the container's writable layer, so they persist even after you remove the container. A new container deployed later with the same volume mapping will show all the changes you made in the previous one.
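For example, a mapping like this (the host path is hypothetical) is what would make the configuration survive removal of the container:

docker run -d --name nginx -v /srv/nginx-conf:/etc/nginx/conf.d nginx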

Updating a container created from a custom dockerfile

Before anything, I have read this question and the related links in it, but I am still confused about how to resolve this on my setup.
I wrote my own docker file to install Archiva, which is very similar to this file. I created an image from the docker file using docker build -t archiva . and have a container which I run using docker run archiva. As seen in the docker file, the user data that I want to preserve is in a volume.
Now I want to upgrade to Archiva 2.2.0. How can I update my container so that the user data that's in the volume is preserved? If I just change the version number in the docker file and run docker build again, it will just create another image.
Best practice
The --volume option of docker run enables sharing files between the host and container(s), and in particular preserving consistent [user] data.
The problem is ..
.. it appears that you are not using --volume and that the user data are in the image (and that's a bad practice because it leads to the situation you are in: being unable to upgrade a service easily).
One solution (the best IMO) is
Back-up the user data
To use the command docker cp: "Copy files/folders between a container and the local filesystem."
docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH
Upgrade your Dockerfile
By editing your Dockerfile and changing the version.
Use the --volume option
Use docker run -v /host/path/user-data:/container/path/user-data archiva
And you're good!
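Putting the steps together, a sketch of the whole upgrade (the container name and the in-container data path are assumptions; check where your Archiva image actually keeps its data):

# 1. back up the user data from the old container
docker cp old-archiva:/var/archiva ./archiva-userdata
# 2. bump the version in the Dockerfile, then rebuild
docker build -t archiva:2.2.0 .
# 3. run the new version with the user data mounted from the host
docker run -d --name archiva -v "$(pwd)/archiva-userdata:/var/archiva" archiva:2.2.0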

Copy files from host to docker container then commit and push

I'm using docker on Ubuntu. During the development phase I cloned all the source code from Git on the host, edit it in WebStorm, and then run it with Node.js inside a docker container with -v /host_dev_src:/container_src so that I can test.
Then when I wanted to send it for testing, I committed the container and pushed a new version. But when I pulled and ran the image on the test machine, the source code was missing. That makes sense, as there's no /host_src available on the test machine.
My current workaround is to clone the source code on the test machine and run docker with -v /host_test_src:/container_src. But I'd like to know if it's possible to copy the source code directly into the container and avoid that manipulation. I'd prefer to just copy, paste and run the image file with the source code, especially since there's no Internet connection on our testing machines.
PS: It seems docker cp only supports copying files from the container to the host.
One solution is to have a git clone step in the Dockerfile which adds the source code into the image. During development, you can override this code with your -v argument to docker run so that you can make changes without rebuilding. When it comes to testing, you just check your changes in and build a new image. Now you have a fully standalone image for testing.
Note that if you have a VOLUME instruction in your Dockerfile, you will need to make sure it occurs after the git clone step.
The problem with this approach is that if you are using a compiled language, you only want your binaries to live in the final image. In this case, the git clone needs to be replaced with some code that either fetches or compiles the binaries.
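A sketch of such a Dockerfile for the Node.js case described in the question (the repository URL and start command are placeholders):

FROM node:18
# bake the source into the image; during development, override with -v /host_dev_src:/container_src
RUN git clone https://example.com/your/repo.git /container_src
WORKDIR /container_src
# assuming the repo contains a package.json
RUN npm install
# declare the volume only after the source is in place
VOLUME /container_src
CMD ["node", "server.js"]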
Please treat your source code as data, then package it as a data container; see https://docs.docker.com/userguide/dockervolumes/
Step 1: Create the app_src docker image
Put a Dockerfile inside your git repo like this:
FROM busybox
ADD . /container_src
VOLUME /container_src
Then you can build the source image like this:
docker build -t app_src .
During development, you can always use your old solution, -v /host_dev_src:/container_src.
Step 2: Transfer this docker image like the app image
You can transfer this app_src image to the test system in the same way as your application image, probably via a docker registry.
Step 3: Run it as a data container
In the test system, run the app container on top of it (I use ubuntu for the demo):
docker run -d -v /container_src --name code app_src
docker run -it --volumes-from code ubuntu bash
root@dfb2bb8456fe:/# ls /container_src
Dockerfile hello.c
root@dfb2bb8456fe:/#
Hope it helps.
(Credits to https://github.com/toffer/docker-data-only-container-demo , where I got the detailed ideas.)
Adding to Adrian's answer, I do git clone, and then do
CMD git pull && start-my-service
so the latest code at the checked out branch gets run. This is obviously not for everyone, but it works in some software release models.
You could try having two Dockerfiles. The base one would know how to run your app from a predefined folder, but not declare it a volume. When developing, you would run this container with your host folder mounted as a volume. The other one, the package one, would inherit the base one and copy/add the files from your host directory, again without volumes, so that it carries all the files to the tester's host.
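A sketch of the two files (image names, paths, and the start command are illustrative):

# Dockerfile.base - knows how to run the app from a predefined folder, no VOLUME declared
FROM node:18
WORKDIR /app
CMD ["node", "server.js"]

# Dockerfile.package - inherits the base and bakes the sources in for the tester
FROM myapp-base
COPY . /app

Build the base once, mount your source over /app while developing, and build the package image when handing off:

docker build -f Dockerfile.base -t myapp-base .
docker run -v "$(pwd):/app" myapp-base                  # development
docker build -f Dockerfile.package -t myapp-package .   # for the tester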
