Copying the docker image folder between partitions with rsync

I'd like to move my /var/lib/docker data to another place, and to make it safe, I'd like to use rsync.
But the data are stored as sparse files, and rsync does not seem to handle them properly.
What would be the right parameters for rsync?
-a properly preserves the uid/gid and permissions
-S handles sparse files efficiently, but rsync never seems to finish
Without -S, rsync tries to copy more data than the original location can contain (100G on a 48G partition). With -S, I seem to be stuck forever after about 10G.
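For context, the mismatch comes from the sparse files' apparent size exceeding the blocks actually allocated; GNU du can show both numbers:
du -sh --apparent-size /var/lib/docker   # apparent size, roughly what rsync writes out without -S
du -sh /var/lib/docker                   # blocks actually allocated on disk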

It seems that rsync -avXS is working like a charm.

If you rsync /var/lib/docker to a remote server, be sure to tell rsync not to map the uids and gids between the two systems. Otherwise you could end up with wrong ownership of files in your containers.
So this would create an exact copy:
rsync -avHXS --numeric-ids /var/lib/docker/. root@some.host.com:/var/lib/docker
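For a purely local move to another partition, the same flags apply. A minimal sketch, assuming the new partition is mounted at /mnt/newdisk (the mount point is illustrative) and that the host uses systemd; stop the daemon first so the data is not changing underneath rsync:
systemctl stop docker    # assumes systemd
rsync -avHXS --numeric-ids /var/lib/docker/ /mnt/newdisk/docker/
Afterwards you would point the daemon at the new location (for example with the data-root setting in /etc/docker/daemon.json) and start it again.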

Related

Exclude a directory from `podman/docker export` stream and save to a file

I have a container that I want to export as a .tar file. I have used a podman run with a tar --exclude=/dir1 --exclude=/dir2 … that outputs to a file located on a bind-mounted host dir. But recently this has been giving me some tar: .: file changed as we read it errors, which podman/docker export would avoid. Besides, I suppose the export is more efficient. So I'm trying to migrate to using the export, but the major obstacle is that I can't seem to find a way to exclude paths from the tar stream.
If possible, I'd like to avoid modifying a tar archive already saved on disk, and instead modify the stream before it gets saved to a file.
I've been banging my head for multiple hours, trying useless advice from ChatGPT, looking at cpio, and attempting to pipe the podman export into a tar --exclude … command. With the last I did have some small success at one point, but I couldn't make tar save the result to a file with a particular name.
Any suggestions?
(note: I make no distinction between docker and podman here, as their export commands are identical, and mentioning both is useful for searchability)
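One hedged sketch of a workaround, assuming GNU tar (the container name, excluded directories, and output filename are illustrative): let tar apply the exclusions while extracting the export stream into a scratch directory, then re-pack that directory into the file you want. This stages files on disk rather than filtering the stream in flight:
tmpdir=$(mktemp -d)
podman export mycontainer | tar -x --exclude='dir1' --exclude='dir2' -C "$tmpdir"   # exclusion patterns may need anchoring, depending on how member names appear in the stream
tar -C "$tmpdir" -cf mycontainer.tar .
rm -rf "$tmpdir"
GNU tar also has a --delete mode that rewrites an archive, which in principle could filter the stream directly (podman export … | tar -f - --delete …), but I have not verified that it accepts a non-seekable pipe, so treat that as untested.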

Is there any way to make a COPY command in a Dockerfile silent or less verbose?

We have a Dockerfile which copies the folder to another folder.
Something like
COPY . /mycode
But the problem is that there are tons of files in the generated code, and it creates 10K+ lines in the Jenkins log where we run the CI/CD pipeline:
lines like copying /bla/blah/bla to /bla/blah/bla, repeated 10k times.
Is there a way to make this COPY less verbose or silent? The Jenkins admin has already warned us that our log file is nearing the maximum allowed size.
You can tar/zip the files on the host, so there's only one file to copy. Then untar/unzip after it's been copied and direct the output of untar/unzip to /dev/null.
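A minimal sketch of that approach, assuming the code lives in ./mycode on the host (the archive name and paths are illustrative). Note that Docker's ADD instruction auto-extracts a local tar archive, which keeps this to a single, quiet build step; if you prefer COPY, extract it yourself and discard tar's output as described above:
# on the host, before the build:
tar -czf mycode.tar.gz -C ./mycode .
# in the Dockerfile, either let ADD unpack it:
ADD mycode.tar.gz /mycode/
# or copy and extract it explicitly:
# COPY mycode.tar.gz /tmp/
# RUN mkdir -p /mycode && tar -xzf /tmp/mycode.tar.gz -C /mycode > /dev/null 2>&1 && rm /tmp/mycode.tar.gz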
According to the documentation, the docker cp command doesn't have any "silent" switches, but perhaps redirecting the output may help. Have you tried:
docker cp -a CONTAINER:SRC_PATH DEST_PATH|- &> /dev/null
I know it's not the most elegant, but if what you seek is to suppress the console output, that may help.
Sources
[1] https://docs.docker.com/engine/reference/commandline/cp/
[2] https://csatlas.com/bash-redirect-stdout-stderr/

"Device or resource busy" when i try move /etc/resolv.conf in ubuntu:18.04. How fix it?

I have a VPN client in my Docker container (ubuntu:18.04).
The client must do the following:
mv /etc/resolv.conf /etc/resolv.conf.orig
Then the client should create a new /etc/resolv.conf with its own DNS servers. However, the move fails with an error:
mv: cannot move '/etc/resolv.conf' to '/etc/resolv.conf.orig': Device or resource busy
Can this be fixed? Thank you in advance.
P.S.: I can't change the VPN client code.
Within the Docker container the /etc/resolv.conf file is not an ordinary regular file. Docker manages it in a special manner: the container engine writes container-specific configuration into the file outside of the container and bind-mounts it to /etc/resolv.conf inside the container.
When your VPN client runs mv /etc/resolv.conf /etc/resolv.conf.orig, things boil down to the rename(2) syscall (or a similar call from that family), and, according to the manpage for this syscall, the EBUSY (Device or resource busy) error can be returned for a few reasons, including the situation when the original file is a mount point:
EBUSY
The rename fails because oldpath or newpath is a directory that is in use by some process (perhaps as current working directory, or as root directory, or because it was open for reading) or is in use by the system (for example as mount point), while the system considers this an error. (Note that there is no requirement to return EBUSY in such cases — there is nothing wrong with doing the rename anyway — but it is allowed to return EBUSY if the system cannot otherwise handle such situations.)
Though there is a remark that the error is not guaranteed in such circumstances, it seems that it always fires for bind-mount targets (which I guess is what happens here):
$ touch sourcefile destfile
$ sudo mount --bind sourcefile destfile
$ mv destfile anotherfile
mv: cannot move 'destfile' to 'anotherfile': Device or resource busy
So, similarly, you cannot move /etc/resolv.conf inside the container, since it is a bind-mount target, and there is no straightforward solution.
Given that the bind-mount of /etc/resolv.conf is a read-write mount, not a read-only one, it is still possible to overwrite this file:
$ mount | grep resolv.conf
/dev/sda1 on /etc/resolv.conf type ext4 (rw,relatime)
So, a possible fix could be to copy this file to the .orig backup and then overwrite the original in place, instead of renaming the original file and re-creating it.
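A minimal sketch of that copy-then-overwrite approach (the nameserver address below is purely illustrative):
cp /etc/resolv.conf /etc/resolv.conf.orig        # back up the current contents without touching the mount point
echo "nameserver 10.0.0.53" > /etc/resolv.conf   # overwrite the bind-mounted file in place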
Unfortunately, this does not meet your restriction ("I can't change the VPN client code."), so I bet you are out of luck here.
Any method that requires moving a file onto /etc/resolv.conf fails in a Docker container.
The workaround is to rewrite the original file instead of moving or renaming a modified version onto it.
For example, use the following at a bash prompt:
(rc=$(sed 's/^\(nameserver 192\.168\.\)/# \1/' /etc/resolv.conf)
echo "$rc" > /etc/resolv.conf)
This works by rewriting /etc/resolv.conf as follows:
read and modify the current contents of /etc/resolv.conf through the stream editor, sed
the sed script in this example is for commenting out lines starting with nameserver 192.168.
save the updated contents in a variable, rc
overwrite the original file /etc/resolv.conf with updated contents in "$rc"
The command list is in parentheses so it runs in a subshell, which avoids polluting the current shell's namespace with the variable name rc, just in case it happens to be in use.
Note that this command does not require sudo since it is taking advantage of the super user privileges available by default inside the container.
Note that sed -i (editing in-place) involves moving the updated file onto the original and will not work.
But if the visual editor, vi, is available in the container, editing and saving /etc/resolv.conf with vi works, since vi modifies the original file directly.

Remote Path for monitoring changes

I've created a simple script based on inotify-tools, but when I finally decided to monitor /remotepath, which was mounted from a NAS with mount.cifs, it didn't work.
After some investigation I found that inotify-tools has no support for remote folders.
Does anyone have experience with a simple tool that would let me watch a remote folder and, if something changes, run rsync?
Or maybe I should just go with rsync and sync only new files from the remote folder?
Thanks for any ideas.
In the meantime I created a simple bash script that does what I want, but I'm fighting with a problem: what happens if something is deleted from the destination folder and I don't want to synchronize that deleted file again?
Any idea how to fix this problem?
#!/bin/bash
### Log paths
path="/var/log/compare"
log="compare.log"
listing1="listing1.log"
listing2="listing2.log"
### Paths which will be monitored
destination="/path/to/destination/"
source="/path/to/remote/folder"
### Take a listing of the source folder's contents
ls -lh "$source" > "$path/$listing1"
### Check whether something has changed
echo "$(date)" 'INFO' 'I will compare listing files' >> "$path/$log"
if cmp -s "$path/$listing1" "$path/$listing2"
then
    ### Listings are the same
    echo "$(date)" 'INFO' 'Listings are the same' >> "$path/$log"
else
    ### Listings are different
    rsync -art "$source" "$destination"
    echo "$(date)" 'INFO' 'Finished synchronization' >> "$path/$log"
fi
cp "$path/$listing1" "$path/$listing2"
inotify is indeed the wrong tool for the job; it works by intercepting filesystem activity in the kernel, so remote activity will be totally missed.
An ideal solution would be to run inotify on the NAS box. This is certainly possible with some devices, but you don't say what device you have (and if you did I probably don't have the same one).
Any other tool that exists will just do exactly what your script does (albeit maybe more prettily), so I don't think you need to pursue that avenue.
In any case, your script is entirely redundant! If you just ran rsync periodically it would do exactly the same thing (rsync -t just compares the file names, sizes, and timestamps). The only difference is that rsync will compare the remote file list against your real files, not a cached copy of the file-list. Incidentally, -a implies both -r and -t, so rsync -a is sufficient.
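For example, a minimal sketch of a crontab entry that syncs every five minutes (the interval and log path are arbitrary, and the paths reuse the placeholders from the script above):
*/5 * * * * rsync -a /path/to/remote/folder /path/to/destination/ >> /var/log/compare/compare.log 2>&1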
One thing I would look into: running rsync on the NAS device directly. Accessing the file list through CIFS is less efficient than running it locally, so if your NAS can support rsync then do it that way. (Synology NAS boxes have rsync, but I don't know about other devices.)

Rsync folder with a million files, but very small incremental daily updates

We run an rsync on a large folder. It has close to a million files inside it, including html, jsp, gif/jpg, etc. Of course, we only need to update files incrementally: just a few JSP and HTML files are updated in this folder, and we need this server to rsync to a different server, same folder.
rsync has been running quite slowly at times lately, so one of our IT team members created this command:
find /usr/home/foldername \
  -type f -name '*.jsp' -exec \
  grep -l '<ssi:include src=[^$]*${' {} \;
This looks only for files with a .jsp extension that contain certain kinds of text, because these are the files we need to rsync. But this command consumes a lot of memory. I think it's a stupid way to rsync, but I'm being told this is how things will work.
Some googling suggests that this should work on this folder too:
rsync -a --update --progress --rsh=ssh --partial /usr/home/foldername /destination/server
I'm worried that this will be too slow on a daily basis, but I can't imagine why this will be slower than that silly find option that our IT folks are recommending. Any ideas about large directory rsyncs in the real world?
A find command will not be faster than the rsync scan, and the grep command must be slower than rsync because it requires reading all the text from all the .jsp files.
The only way a find-and-grep could be faster is if:
1. The timestamps on your files do not match, so rsync has to checksum the contents (on both sides!).
This seems unlikely, since you're using -a, which syncs the timestamps properly (because -a implies -t). However, it can happen if the file-systems on the different machines allow different timestamp precision (e.g. Linux vs. Windows), in which case the --modify-window option is what you need.
2. There are many more files changed than the ones you care about, and rsync is transferring those as well.
If this is the case then you can limit the transfer to .jsp files like this:
--include '*.jsp' --include '*/' --exclude '*'
(Include all .jsp files and all directories, but exclude everything else; a full command is sketched after this list.)
3. rsync does the scan up front, then does the compare (possibly using lots of RAM), then does the transfer, whereas find/grep/copy does it as it goes.
This used to be a problem, but rsync ought to do an incremental recursive scan as long as both local and remote versions are 3.0.0 or greater, and you don't use any of the fancy delete or delay options that force an up-front scan (see --recursive in the documentation).
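For reference, a sketch of a complete command with those filters (the remote host is a placeholder, and the trailing slash on the source makes rsync copy the folder's contents rather than the folder itself):
rsync -a --include='*.jsp' --include='*/' --exclude='*' /usr/home/foldername/ user@otherserver:/usr/home/foldername/
The only ordering requirement is that both include rules come before the catch-all --exclude '*', because rsync applies the first filter rule that matches each file.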
