Remote path for monitoring changes

I've created a simple script based on inotify-tools, but when I decided to monitor /remotepath, which was mounted from a NAS with mount.cifs, it didn't work.
After some investigation I found that inotify-tools does not support remote folders.
Does anyone have experience with a simple tool that would let me watch a remote folder and run rsync when something changes?
Or maybe I should just go with rsync and only sync new files from the remote folder?
Thanks for any ideas.
In the meantime I created a simple bash script that does what I want, but I'm fighting with one problem: what happens if something is deleted from the destination folder and I don't want to synchronize that deleted file again?
Any idea how to fix this?
#!/bin/bash
### Logs path
path="/var/log/compare"
log="compare.log"
listing1="listing1.log"
listing2="listing2.log"
### Paths which will be monitored
destination="/path/to/destination/"
source="/path/to/remote/folder"
### Record the current content of the source folder
ls -lh "$source" > "$path/$listing1"
### Check whether something changed since the last run
echo "$(date)" 'INFO' 'I will compare listing files' >> "$path/$log"
if cmp -s "$path/$listing1" "$path/$listing2"
### Listings are the same
then
    echo "$(date)" 'INFO' 'Listings are the same' >> "$path/$log"
### Listings are different
else
    rsync -art "$source" "$destination"
    echo "$(date)" 'INFO' 'Finished synchronization' >> "$path/$log"
fi
### Keep the current listing for the next comparison
cp "$path/$listing1" "$path/$listing2"

inotify is indeed the wrong tool for the job; it works by intercepting filesystem activity in the kernel, so remote activity will be totally missed.
An ideal solution would be to run inotify on the NAS box. This is certainly possible with some devices, but you don't say what device you have (and if you did I probably don't have the same one).
Any other tool that exists will just do exactly what your script does (albeit maybe more prettily), so I don't think you need to pursue that avenue.
In any case, your script is entirely redundant! If you just ran rsync periodically it would do exactly the same thing (rsync -t just compares the file names, sizes, and timestamps). The only difference is that rsync will compare the remote file list against your real files, not a cached copy of the file-list. Incidentally, -a implies both -r and -t, so rsync -a is sufficient.
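For example, a single cron entry along these lines would do the same job without the listing files; the schedule is just an example and the paths are the ones from your script:
# /etc/cron.d/sync-remote -- sketch only, adjust the schedule and paths
*/5 * * * * root rsync -a /path/to/remote/folder /path/to/destination/ >> /var/log/compare/compare.log 2>&1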
One thing I would look into: running rsync on the NAS device directly. Accessing the file list through CIFS is less efficient than running it locally, so if your NAS can run rsync then do it that way. (Synology NAS boxes have rsync, but I don't know about other devices.)
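If the NAS does offer rsync (over SSH or as an rsync daemon), the periodic job could also pull directly instead of going through the CIFS mount. A rough sketch; the hostname, share path and module name below are placeholders:
# Pull over SSH (rsync runs on the NAS end):
rsync -a user@nas:/volume1/share/ /path/to/destination/
# Or pull from an rsync daemon module exposed by the NAS:
rsync -a rsync://nas/share/ /path/to/destination/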

Related

Is there any way to make a COPY command on Dockerfile silent or less verbose?

We have a Dockerfile which copies the folder to another folder.
Something like
COPY . /mycode
But the problem is that there are tons of files in the generated code, and it creates 10K+ lines in the Jenkins log where we are running the CI/CD pipeline:
copying /bla/blah/bla to /bla/blah/bla, repeated 10k times.
Is there a way to make this COPY less verbose or silent? The Jenkins admin has already warned us that our log file is nearing his max limit.
You can tar/zip the files on the host, so there's only one file to copy. Then untar/unzip after it's been copied and direct the output of untar/unzip to /dev/null.
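A minimal sketch of that approach; the archive name, paths and base image here are just assumptions:
# On the host, before "docker build":  tar czf mycode.tar.gz -C ./mycode .
FROM alpine:3.19
# Only one file to copy, so only one line in the build log
COPY mycode.tar.gz /tmp/mycode.tar.gz
RUN mkdir -p /mycode \
    && tar xzf /tmp/mycode.tar.gz -C /mycode > /dev/null 2>&1 \
    && rm /tmp/mycode.tar.gz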
The docker cp command doesn't have any "silent" switches by definition, but perhaps redirecting the output may help. Have you tried:
docker cp -a CONTAINER:SRC_PATH DEST_PATH|- &> /dev/null
I know it's not the most elegant, but if what you seek is to suppress the console output, that may help.
Sources
[1] https://docs.docker.com/engine/reference/commandline/cp/
[2] https://csatlas.com/bash-redirect-stdout-stderr/

Alternative to using --squash when building docker images on Windows using local files

We have some local installers and zip files that we use to build our docker images. It is easy to get this to work in a Dockerfile:
FROM mcr.microsoft.com/windows/nanoserver
COPY myinstaller.exe .
RUN myinstaller.exe; \
del myinstaller.exe
The problem here is that it produces a layer for the COPY line, which increases the size of the image. A common work-around for this is to have one RUN line that downloads the file from the Internet, runs it, and then deletes the installation file. The problem, as written above, is that the installers are on the local filesystem.
I found that there is a --squash command for docker:
docker build --squash -t mytestimage .
This does exactly what I want: it gives me an image without the extra installer file, which is not needed in the final image. To run this command you need to enable experimental features, though. There is also an open issue to simply remove this feature:
https://github.com/moby/moby/issues/34565
Is there some alternative way of using local installers in a Dockerfile when running on Windows, that doesn't involve setting up a server to provide the files?
We ended up setting up nginx to provide files when building. On our build server, the machine building our docker images and the server that has the installer files have a very good connection between them, so downloading huge files is not a real problem.
When it comes to --squash, it is bugged for Docker on Windows. Here is the relevant issue for it:
https://github.com/moby/moby/issues/31468
There is an issue to move --squash out of experimental, but it doesn't seem to have a lot of support:
https://github.com/moby/moby/issues/38657
The alternative that some people propose instead of --squash is multi-stage builds; discussion here:
https://github.com/moby/moby/issues/34565
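For completeness, the download-install-delete-in-one-RUN pattern with files served from an internal nginx could look roughly like this; the base image (Server Core, since it ships curl.exe), the URL and the installer's silent flag are assumptions:
# escape=`
FROM mcr.microsoft.com/windows/servercore:ltsc2022
RUN curl.exe -sSL -o C:\myinstaller.exe http://buildserver/installers/myinstaller.exe && `
    C:\myinstaller.exe /S && `
    del C:\myinstaller.exe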
There is an alternative to --squash if you have local installer files, don't want to set up a web server, would like your Docker image to be small, and are running Windows: use mapped drives.
In Windows, you can share folders with other users on your network. A Docker container is like another computer running on your physical machine, and it can access these network shares.
First set up a new user, for example username share and password password1. Create a folder somewhere on your computer, right-click it, click Properties, go to the Sharing tab and click "Share". Find the user that you have just created, using the little dropdown menu and "Find people ...", and share the folder with this user.
Create a folder somewhere for your test project. Create a batch file setupshare.bat that looks like this:
@echo off
for /f "tokens=2 delims=:" %%i in ('ipconfig ^| findstr /C:"Default Gateway"') do (
    set hostip=%%i
    goto :end
)
:end
set hostip=%hostip: =%
net use O: \\%hostip%\vms /USER:share password1
The first part of this file is only to find the IP address that the docker container can use to access its host computer. It is not the prettiest thing I've ever put together, so let me know if there's a better way!
It uses a for-loop, as that is the way to save the output of a command into a variable in batch files. The command is ipconfig, and we pipe it to findstr to search for Default Gateway. We need to use ^| instead of just | because it is inside a for-loop. The for-loop splits each line from the command on the delimiter, which is : in this case, and we only take the second token. The for-loop only handles the first line if there are multiple entries with a Default Gateway. This script doesn't work if there are multiple entries and the first one is not the correct one.
The line set hostip=%hostip: =% is to remove a space at the start of the string.
We then have the IP address that we want to use stored in hostip. We use this in the net use command, which will map O:\ to shared folder vms on the machine with IP hostip. We use the username share and the password password1. Note that this is a very bad way of handling passwords, as they kind of should be secret!
With a batch file like this, we can set up a Dockerfile in this way:
# escape=`
FROM mcr.microsoft.com/dotnet/core/sdk:3.0
COPY setupshare.bat .
RUN setupshare.bat && `
copy O:\file.txt file.txt
The RUN command will first call setupshare.bat that sets up the network share properly. We can then use any file that we shared, for example a huge installer, and install the things that we want. In this case I have only shared a test file file.txt to see that it works, so just change that line.
I would still advise everyone to just set up a little web server, for example nginx, and use the standard way of writing Dockerfiles, downloading files and running them in the same RUN command. That's what people expect when they see a Dockerfile, and it should be a more robust solution.
We can also hope that the Docker people either make a COPY command that can copy, run, and delete installers in the same layer, or implement --squash properly.

Move file downloaded in Dockerfile to harddrive

First off, I really lack a lot of knowledge regarding Docker itself and its structure. I know that it'd be way more beneficial to learn the basics first, but I do require this to work in order to move on to other things for now.
Within a Dockerfile I installed wget and used it to download a file from a website; authentication and download are successful. However, when I later try to move said file it can't be found, and it doesn't show up using e.g. Explorer either (the path was specified).
I thought it might have something to do with RUN and how it executes the wget command; I read that the container ID can be used to copy it to the hard drive, but how would I do that within a Dockerfile?
RUN wget -P ./path/to/somewhere --user xyz --password bluh http://something.com/file.txt
ADD ./path/to/somewhere/file.txt /mainDirectory
The download is shown and the log-in is successful, but as I mentioned I am having trouble using that file later on, as it is nowhere to be found on the hard drive. Probably a basic error, but I'd really appreciate some input that might lead to a solution.
Obviously the error is produced when trying to execute ADD as there is no file to move. I am trying to find a way to mount a volume in order to store it, but so far in vain.
Edit:
Though the question is similar to the "move to hard drive" one, I am searching for ways to get the ID of the container created within the Dockerfile in order to move the file; while the thread provides such answers, I haven't had any luck using them within the Dockerfile itself.
Short answer is that it's not possible.
The Dockerfile builds an image, which you can run as a short-lived container. During the build, you don't have (write) access to the host and its file system. Which kinda makes sense, since you want to build an immutable image from which to run ephemeral containers.
What you can do is run a container, and mount a path from your host as a volume into the container. This is the only way how you can share files between the host and a container.
Here is an example how you could do this with the sherylynn/wget image:
docker run -v /path/on/host:/path/in/container sherylynn/wget wget -O /path/in/container/file http://my.url.com
The -v HOST:CONTAINER parameter allows you to specify a path on the host that is mounted inside the container at a specified location.
For wget, I would prefer -O over -P when downloading a single file, since it makes it really explicit where your download ends up. When you point -O to the location of the volume, the downloaded file ends up on the host system (in the folder you mounted).
Since I have no idea what your image or your environment looks like, you might need to tweak one or two things to work well with your own image. As a general recommendation: For basic commands like wget or curl, you can find pre-made images on Docker Hub. This can be quite useful when you need to set up a Continuous Integration pipeline or so, where you want to use wget or curl but can't execute it directly.
Use wget -O instead of -P to download to a specific file, e.g.:
RUN wget -O /tmp/new_file.txt --user xyz --password bluh http://something.com/new_file.txt
Thanks

Copying docker image folder between partition with rsync

I'd like to move my /var/lib/docker data to another place, and to make it safe I'd like to use rsync.
But the data are stored as sparse files, and rsync does not seem to handle them properly.
What would be the right parameters for rsync?
-a properly preserves the uid/gid and permissions
-S handles sparse files efficiently, but rsync never seems to finish
Without -S, rsync tries to copy more data than the source partition holds (100G of data from a 48G partition). With -S, I seem to be stuck forever after about 10G.
It seems that rsync -avXS is working like a charm.
If you rsync /var/lib/docker to a remote server, be sure to tell rsync to do no mapping of the uids and gids between the two systems. Otherwise you could end up with wrong ownership of files in your containers.
So this would create an exact copy:
rsync -avHXS --numeric-ids /var/lib/docker/. root@some.host.com:/var/lib/docker

Rsync folder with a million files, but very small incremental daily updates

We run an rsync on a large folder. It has close to a million files inside it, including HTML, JSP, GIF/JPG, etc. Of course we only need to update files incrementally: just a few JSP and HTML files change in this folder, and we need this server to rsync to the same folder on a different server.
rsync has sometimes been running quite slow lately, so one of our IT team members came up with this command:
find /usr/home/foldername \
    -type f -name '*.jsp' -exec \
    grep -l '<ssi:include src=[^$]*${' {} \;
This looks only for files which have a .jsp extension and contain certain kinds of text, because these are the files we need to rsync. But this command is consuming a lot of memory. I think it's a stupid way to rsync, but I'm being told this is how things will work.
Some googling suggests that this should work on this folder too:
rsync -a --update --progress --rsh=ssh --partial /usr/home/foldername /destination/server
I'm worried that this will be too slow on a daily basis, but I can't imagine why this will be slower than that silly find option that our IT folks are recommending. Any ideas about large directory rsyncs in the real world?
A find command will not be faster than the rsync scan, and the grep command must be slower than rsync because it requires reading all the text from all the .jsp files.
The only way a find-and-grep could be faster is if:

1. The timestamps on your files do not match, so rsync has to checksum the contents (on both sides!).
This seems unlikely, since you're using -a, which will sync the timestamps properly (because -a implies -t). However, it can happen if the file-systems on the different machines allow different timestamp precision (e.g. Linux vs. Windows), in which case the --modify-window option is what you need.

2. There are many more files changed than the ones you care about, and rsync is transferring those also.
If this is the case then you can limit the transfer to .jsp files like this:
--include '*.jsp' --include '*/' --exclude '*'
(Include all .jsp files and all directories, but exclude everything else; a fuller command sketch follows after this list.)

3. rsync does the scan up front, then does the compare (possibly using lots of RAM), then does the transfer, whereas find/grep/copy does it as it goes.
This used to be a problem, but rsync ought to do an incremental recursive scan as long as both local and remote versions are 3.0.0 or greater, and you don't use any of the fancy delete or delay options that force an up-front scan (see --recursive in the documentation).
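As promised in point 2, a fuller sketch of the filtered command; the remote destination here is a placeholder:
rsync -a --partial --progress \
    --include '*.jsp' --include '*/' --exclude '*' \
    /usr/home/foldername/ user@otherserver:/usr/home/foldername/
Adding --prune-empty-dirs would also stop rsync from creating directories that end up containing no .jsp files.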
