How to make sh_test depend on building docker image? - bazel

I have a sh_test invoking docker run my_image, where my_image is produced by a container_bundle rule. I need the container_bundle rule to run as a dependency of the sh_test. How can I achieve that? Adding container_bundle to sh_test's data only triggers the container_bundle build, but I need run, which pushes the image to a Docker registry.

What we do is pass the rootpath of the container image rule to the script (as $IMAGE_LOADER) and do:
$IMAGE_LOADER --norun | tee image-loader.log
IMAGE_ID=$(cat ./image-loader.log | grep "Loaded image ID" | cut -d":" -f2-)
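To make the extraction concrete, here is the same grep/cut pipeline run against a made-up loader log line (the "Loaded image ID" line and the image ID are hypothetical; note that the cut leaves a leading space, trimmed here):

```shell
# Feed the pipeline a hypothetical "Loaded image ID" line to see what it extracts.
printf 'Loaded image ID: sha256:abc123\n' > image-loader.log
IMAGE_ID=$(grep "Loaded image ID" image-loader.log | cut -d":" -f2-)
IMAGE_ID=${IMAGE_ID# }   # cut -f2- keeps the space after the first colon; drop it
echo "$IMAGE_ID"         # sha256:abc123
```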

The easiest way I know is wrapping a docker_push rule around your bundle. Then your test rule can run the docker_push's output file, which is a binary that will do the docker load. Use runfiles.bash to get its full path.
Something like this:
# --- begin runfiles.bash initialization v2 ---
# Copy-pasted from the Bazel Bash runfiles library v2.
set -uo pipefail; f=bazel_tools/tools/bash/runfiles/runfiles.bash
source "${RUNFILES_DIR:-/dev/null}/$f" 2>/dev/null || \
source "$(grep -sm1 "^$f " "${RUNFILES_MANIFEST_FILE:-/dev/null}" | cut -f2- -d' ')" 2>/dev/null || \
source "$0.runfiles/$f" 2>/dev/null || \
source "$(grep -sm1 "^$f " "$0.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
source "$(grep -sm1 "^$f " "$0.exe.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
{ echo>&2 "ERROR: cannot find $f"; exit 1; }; f=; set -e
# --- end runfiles.bash initialization v2 ---
$(rlocation "my_workspace/some/package/my_container_push")
With some/package/BUILD having this:
load("@io_bazel_rules_docker//contrib:push-all.bzl", "docker_push")
load("@io_bazel_rules_docker//container:container.bzl", "container_bundle")

container_bundle(
    name = "my_container_bundle",
    # All your existing attrs here, etc etc.
)

docker_push(
    name = "my_container_push",
    bundle = ":my_container_bundle",
)

sh_test(
    name = "my_test",
    srcs = ["my_test.sh"],  # your test script (sh_test requires srcs)
    data = [
        ":my_container_push",
    ],
    deps = [
        "@bazel_tools//tools/bash/runfiles",
    ],
)


Hashicorp Vault inject directory

I want to inject a whole directory using the agent injector.
I would, firstly, like to know if this is even possible.
I will explain myself:
I have this secrets directory: /secret/dev/app/ and under app, I have aws/some_secrets, db/some_secrets, etc...
Is it possible to inject the app directory without having the full secret name?
I would say take a look at Agent Templates.
If you take a look at step 7 of the tutorial:
{{ with secret "secret/data/customers/acme" }}
Organization: {{ .Data.data.organization }}
ID: {{ .Data.data.customer_id }}
Contact: {{ .Data.data.contact_email }}
{{ end }}
You could simply generate this template file with a script and then run the agent. But your script that generates the dynamic template file would have to do some heavy lifting...
List all secrets under a KV v2 basepath (if the engine mount path has no / characters in it):
#!/usr/bin/env bash
listall() {
    kv2opt="/metadata"
    if [ "${1}" = "-kv2" ]; then
        kv2opt="/metadata"
        shift
    elif [ "${1}" = "-kv1" ]; then
        kv2opt=""
        shift
    fi
    sarg=$(printf '%s' "${1}" | sed -E 's~/*$~~g' | sed -E 's~^/*~~g')
    engine=$(printf '%s' "${sarg}" | cut -d/ -f1 )
    if [ "$(printf '%s' "${sarg}" | cut -d/ -f2)" = "metadata" ]; then
        vpath=$(printf '%s' "${sarg}" | sed -E "s~^${engine}/metadata/?~~g" )
    else
        vpath=$(printf '%s' "${sarg}" | sed -E "s~^${engine}/?~~g" )
    fi
    curl -s -H "X-Vault-Request: true" -H "X-Vault-Token: ${VAULT_TOKEN}" --request LIST \
        "${VAULT_ADDR}/v1/${engine}${kv2opt}/${vpath}" | jq -rc '.data.keys[]' | while IFS= read -r li; do
        if [ "${li: -1}" != "/" ]; then
            printf "%s/%s\n" "${sarg}" "${li}"
        else
            listall "${sarg}/${li}"
        fi
    done
}
listall -kv2 "secret/dev/app" | while IFS= read -r path; do
    cat << EOF >> template.tpl
{{ with secret "${path}" }}
${path}: {{ .Data.data }}
{{ end }}
EOF
done
...and then maybe run the resultant template.tpl file through the Vault Agent using the template process. But that's pretty useless if things have to be read by a machine after the template finishes, so you may need to have a new loop read each secret to figure out what the keys are on each secret. And then do some advanced formatting. However, the way you structured your question, this technically answers it, and you can figure out how to do the rest (or reframe your question, or ask a new question).
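As an offline illustration of the template generation above, here is the same loop fed a canned path list instead of a live Vault (the secret paths are made up to match the question's layout):

```shell
# Generate template.tpl from a fixed list of secret paths (no Vault needed).
rm -f template.tpl
printf '%s\n' \
    "secret/dev/app/aws/some_secrets" \
    "secret/dev/app/db/some_secrets" |
while IFS= read -r path; do
cat << EOF >> template.tpl
{{ with secret "${path}" }}
${path}: {{ .Data.data }}
{{ end }}
EOF
done
cat template.tpl
```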

cron not running in alpine docker

I have created and added the below entry in my entry-point.sh for my Dockerfile.
# start cron
/usr/sbin/crond &
exec "${DIST}/bin/ss" "$@"
my crontab.txt looks like below:
bash-4.4$ crontab -l
*/5 * * * * /cleanDisk.sh >> /apps/log/cleanDisk.log
So when I run the docker container, I don't see any file created called cleanDisk.log.
I have set up all permissions, and crond is running as a process in my container, see below.
bash-4.4$ ps -ef | grep cron
12 sdc 0:00 /usr/sbin/crond
208 sdc 0:00 grep cron
So, can anyone guide me why the log file is not getting created?
my cleanDisk.sh looks like below. Since it runs for the very first time and doesn't match all the criteria, I would expect it to at least print "No Error file found on Host $(hostname)" in cleanDisk.log.
#!/bin/bash
THRESHOLD_LIMIT=20
RETENTION_DAY=3
df -Ph /apps/ | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5,$1 }' | while read output
do
    #echo $output
    used=$(echo $output | awk '{print $1}' | sed s/%//g)
    partition=$(echo $output | awk '{print $2}')
    if [ $used -ge ${THRESHOLD_LIMIT} ]; then
        echo "The partition \"$partition\" on $(hostname) has used $used% at $(date)"
        FILE_COUNT=$(find ${SDC_LOG} -maxdepth 1 -mtime +${RETENTION_DAY} -type f -name "sdc-*.sdc" -print | wc -l)
        if [ ${FILE_COUNT} -gt 0 ]; then
            echo "There are ${FILE_COUNT} files older than ${RETENTION_DAY} days on Host $(hostname)."
            for FILENAME in $(find ${SDC_LOG} -maxdepth 1 -mtime +${RETENTION_DAY} -type f -name "sdc-*.sdc" -print);
            do
                ERROR_FILE_SIZE=$(stat -c%s ${FILENAME} | awk '{ split( "B KB MB GB TB PB" , v ); s=1; while( $1>1024 ){ $1/=1024; s++ } printf "%.2f %s\n", $1, v[s] }')
                echo "Before Deleting Error file ${FILENAME}, the size was ${ERROR_FILE_SIZE}."
                rm -rf ${FILENAME}
                rc=$?
                if [[ $rc -eq 0 ]];
                then
                    echo "Error log file ${FILENAME} with size ${ERROR_FILE_SIZE} is deleted on Host $(hostname)."
                fi
            done
        fi
        if [ ${FILE_COUNT} -eq 0 ]; then
            echo "No Error file found on Host $(hostname)."
        fi
    fi
done
Edit:
my Dockerfile looks like this:
FROM adoptopenjdk/openjdk8:jdk8u192-b12-alpine
ARG SDC_UID=20159
ARG SDC_GID=20159
ARG SDC_USER=sdc
RUN apk add --update --no-cache bash \
busybox-suid \
sudo && \
echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf
RUN addgroup --system ${SDC_USER} && \
adduser --system --disabled-password -u ${SDC_UID} -G ${SDC_USER} ${SDC_USER}
ADD --chown=sdc:sdc crontab.txt /etc/crontabs/sdc/
RUN chgrp sdc /etc/cron.d /etc/crontabs /usr/bin/crontab
# Also tried to run like this but not working
# RUN /usr/bin/crontab -u sdc /etc/crontabs/sdc/crontab.txt
USER ${SDC_USER}
EXPOSE 18631
RUN /usr/bin/crontab /etc/crontabs/sdc/crontab.txt
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["dc", "-exec"]

RedHat Memory Used High

Looking for some help if you will..
I have a virtual machine on RedHat 6.5 with 32gb memory.
A free is showing 24.6 GB used, 8.2 GB free. Only 418 MB is cached, 1.8 GB buffers.
I executed a top and sorted by virtual used, and I can only account for about 6 GB of that 24.6 GB used.
A "ps aux" doesn't show any processes that could be taking the memory.
I am flummoxed and looking for some advice on where I can look to see what's taking the memory.
Any help would be appreciated.
The Bash script below will help you figure out which application is consuming how much memory.
#!/bin/bash
# Make sure only root can run our script
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root" 1>&2
exit 1
fi
### Functions
#This function will count memory statistic for passed PID
get_process_mem ()
{
    PID=$1
    #we need to check if 2 files exist
    if [ -f /proc/$PID/status ];
    then
        if [ -f /proc/$PID/smaps ];
        then
            #here we count memory usage, Pss, Private and Shared = Pss-Private
            Pss=`cat /proc/$PID/smaps | grep -e "^Pss:" | awk '{print $2}'| paste -sd+ | bc `
            Private=`cat /proc/$PID/smaps | grep -e "^Private" | awk '{print $2}'| paste -sd+ | bc `
            #we need to be sure that we count Pss and Private memory, to avoid errors
            if [ x"$Pss" != "x" -o x"$Private" != "x" ];
            then
                let Shared=${Pss}-${Private}
                Name=`cat /proc/$PID/status | grep -e "^Name:" |cut -d':' -f2`
                #we keep all results in bytes
                let Shared=${Shared}*1024
                let Private=${Private}*1024
                let Sum=${Shared}+${Private}
                echo -e "$Private + $Shared = $Sum \t $Name"
            fi
        fi
    fi
}
#this function makes the conversion from bytes to Kb or Mb or Gb
convert()
{
    value=$1
    power=0
    #if value is 0, we make it like 0.00
    if [ "$value" = "0" ];
    then
        value="0.00"
    fi
    #We check whether the value is bigger than 1024, and if yes we divide by 1024
    while [ $(echo "${value} > 1024"|bc) -eq 1 ]
    do
        value=$(echo "scale=2;${value}/1024" |bc)
        let power=$power+1
    done
    #this part gets b,kb,mb or gb according to the number of divisions
    case $power in
        0) reg=b;;
        1) reg=kb;;
        2) reg=mb;;
        3) reg=gb;;
    esac
    echo -n "${value} ${reg} "
}
#to ensure that temp files do not exist
[[ -f /tmp/res ]] && rm -f /tmp/res
[[ -f /tmp/res2 ]] && rm -f /tmp/res2
[[ -f /tmp/res3 ]] && rm -f /tmp/res3
#if an argument is passed the script will show statistics only for that pid; if not, we list all processes in /proc/
#and get statistics for all of them; all results are stored in the file /tmp/res
if [ $# -eq 0 ]
then
    pids=`ls /proc | grep -e [0-9] | grep -v [A-Za-z] `
    for i in $pids
    do
        get_process_mem $i >> /tmp/res
    done
else
    get_process_mem $1 >> /tmp/res
fi
#This will sort result by memory usage
cat /tmp/res | sort -gr -k 5 > /tmp/res2
#this part will get uniq names from the process list, and we will add up all lines with the same process name
#we will count the number of processes with the same name, so if there is more than 1 process there will be
# process(2) in the output
for Name in `cat /tmp/res2 | awk '{print $6}' | sort | uniq`
do
    count=`cat /tmp/res2 | awk -v src=$Name '{if ($6==src) {print $6}}'|wc -l| awk '{print $1}'`
    if [ $count = "1" ];
    then
        count=""
    else
        count="(${count})"
    fi
    VmSizeKB=`cat /tmp/res2 | awk -v src=$Name '{if ($6==src) {print $1}}' | paste -sd+ | bc`
    VmRssKB=`cat /tmp/res2 | awk -v src=$Name '{if ($6==src) {print $3}}' | paste -sd+ | bc`
    total=`cat /tmp/res2 | awk '{print $5}' | paste -sd+ | bc`
    Sum=`echo "${VmRssKB}+${VmSizeKB}"|bc`
    #all results are stored in the /tmp/res3 file
    echo -e "$VmSizeKB + $VmRssKB = $Sum \t ${Name}${count}" >>/tmp/res3
done
#this makes the sort once more.
cat /tmp/res3 | sort -gr -k 5 | uniq > /tmp/res
#now we print the result, first the header
echo -e "Private \t + \t Shared \t = \t RAM used \t Program"
#then we read the temp file line by line
while read line
do
    echo $line | while read a b c d e f
    do
        #we print all processes if Ram used is not 0
        if [ $e != "0" ]; then
            #here we use the function that makes the conversion
            echo -en "`convert $a` \t $b \t `convert $c` \t $d \t `convert $e` \t $f"
            echo ""
        fi
    done
done < /tmp/res
#this part prints the footer, with the counted Ram usage
echo "--------------------------------------------------------"
echo -e "\t\t\t\t\t\t `convert $total`"
echo "========================================================"
# we clean the temporary files
[[ -f /tmp/res ]] && rm -f /tmp/res
[[ -f /tmp/res2 ]] && rm -f /tmp/res2
[[ -f /tmp/res3 ]] && rm -f /tmp/res3
I am going to take a wild stab at this. Without access to the machine or additional information, troubleshooting this will be difficult.
The /tmp file system is special in that it can live entirely in memory (when mounted as tmpfs). There are a couple of others like this, but /tmp is a special flower. Check the disk usage on this directory and you may see where your memory is getting consumed. ( du -sh /tmp )
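One way to check that theory without extra tooling (assuming a Linux host) is to compare the Shmem line of /proc/meminfo, which includes tmpfs pages, against what /tmp actually holds:

```shell
# Shmem includes tmpfs usage; a large value here alongside a full /tmp supports the theory.
awk '$1 ~ /^(MemTotal|MemFree|Buffers|Cached|Shmem|Slab):$/ {printf "%-10s %6d MB\n", $1, $2/1024}' /proc/meminfo
du -sh /tmp 2>/dev/null || true
```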

How can I find a Docker image with a specific tag in Docker registry on the Docker command line?

I am trying to locate one specific tag for a Docker image. How can I do that on the command line? I want to avoid downloading all the images and then removing the unneeded ones.
In the official Ubuntu release, https://registry.hub.docker.com/_/ubuntu/, there are several tags (release for it), while when I search it on the command line,
user@ubuntu:~$ docker search ubuntu | grep ^ubuntu
ubuntu Official Ubuntu base image 354
ubuntu-upstart Upstart is an event-based replacement for ... 7
ubuntufan/ping 0
ubuntu-debootstrap 0
Also, the help for the command-line search, https://docs.docker.com/engine/reference/commandline/search/, gives no clue how it could work.
Is it possible in the docker search command?
If I use a raw command to search via the Docker registry API, then the information can be fetched:
$ curl https://registry.hub.docker.com/v1/repositories/ubuntu/tags | python -mjson.tool
[
{
"layer": "ef83896b",
"name": "latest"
},
.....
{
"layer": "463ff6be",
"name": "raring"
},
{
"layer": "195eb90b",
"name": "saucy"
},
{
"layer": "ef83896b",
"name": "trusty"
}
]
When using CoreOS, jq is available to parse JSON data.
So like you were doing before, looking at library/centos:
$ curl -s -S 'https://registry.hub.docker.com/v2/repositories/library/centos/tags/' | jq '."results"[]["name"]' |sort
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
The cleaner v2 API is available now, and that's what I'm using in the example. I will build a simple script docker_remote_tags:
#!/usr/bin/bash
curl -s -S "https://registry.hub.docker.com/v2/repositories/$@/tags/" | jq '."results"[]["name"]' | sort
Enables:
$ ./docker_remote_tags library/centos
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
Reference:
jq: https://stedolan.github.io/jq/ | apt-get install jq
I didn't like any of the solutions above because A) they required external libraries that I didn't have and didn't want to install. B) I didn't get all the pages.
The Docker API limits you to 100 items per request. This will loop over each "next" item and get them all (for Python it's seven pages; for others it may be more or less... it depends).
If you really want to spam yourself, remove | cut -d '-' -f 1 from the last line, and you will see absolutely everything.
url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `# Initial url` ; \
( \
while [ ! -z $url ]; do `# Keep looping until the variable url is empty` \
>&2 echo -n "." `# Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the output to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be stored in a variable called content` ; \
url=$(echo "$content" | head -n 1) `# Let's get the first line of content which contains the next URL for the loop to continue` ; \
echo "$content" | tail -n +2 `# Print the content without the first line (yes +2 is counter intuitive)` ; \
done; \
>&2 echo `# Finally break the line of dots` ; \
) | cut -d '-' -f 1 | sort --version-sort | uniq;
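The Python one-liner doing the heavy lifting in that loop can be exercised offline by feeding it a canned page of results (the tag names and the null "next" field below are made up; python3 is used here, but the print() form also runs under Python 2):

```shell
# One fake "page": no next URL (so the first printed line is blank), then two tag names.
echo '{"next": null, "results": [{"name": "2.8"}, {"name": "3.0"}]}' |
    python3 -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))'
# prints a blank line, then 2.8 and 3.0
```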
Sample output:
$ url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `#initial url` ; \
> ( \
> while [ ! -z $url ]; do `#Keep looping until the variable url is empty` \
> >&2 echo -n "." `#Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
> content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the JSON to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be store in a variable called content` ; \
> url=$(echo "$content" | head -n 1) `#Let's get the first line of content which contains the next URL for the loop to continue` ; \
> echo "$content" | tail -n +2 `#Print the content with out the first line (yes +2 is counter intuitive)` ; \
> done; \
> >&2 echo `#Finally break the line of dots` ; \
> ) | cut -d '-' -f 1 | sort --version-sort | uniq;
...
2
2.6
2.6.17
2.8
2.8.6
2.8.7
2.8.8
2.8.9
2.8.10
2.8.11
2.8.12
2.8.13
2.8.14
2.8.15
2.8.16
2.8.17
2.8.18
2.8.19
2.8.20
2.8.21
2.8.22
2.8.23
3
3.0
3.0.0
3.0.1
3.0.2
3.0.3
3.0.4
3.0.5
3.0.6
3.0.7
3.0.504
3.2
3.2.0
3.2.1
3.2.2
3.2.3
3.2.4
3.2.5
3.2.6
3.2.7
3.2.8
3.2.9
3.2.10
3.2.11
3.2.100
4
4.0
4.0.0
4.0.1
4.0.2
4.0.4
4.0.5
4.0.6
4.0.7
4.0.8
32bit
alpine
latest
nanoserver
windowsservercore
If you want the bash_profile version:
function docker-tags () {
    name=$1
    # Initial URL
    url=https://registry.hub.docker.com/v2/repositories/library/$name/tags/?page_size=100
    (
        # Keep looping until the variable url is empty
        while [ ! -z $url ]; do
            # Every iteration of the loop prints out a single dot to show progress as it gets through all the pages (this is an inline dot)
            >&2 echo -n "."
            # Curl the URL and pipe the output to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages)
            # then continue to loop over the results extracting only the name; all will be stored in a variable called content
            content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))')
            # Let's get the first line of content which contains the next URL for the loop to continue
            url=$(echo "$content" | head -n 1)
            # Print the content without the first line (yes +2 is counter intuitive)
            echo "$content" | tail -n +2
        done;
        # Finally break the line of dots
        >&2 echo
    ) | cut -d '-' -f 1 | sort --version-sort | uniq;
}
And simply call it: docker-tags redis
Sample output:
$ docker-tags redis
...
2
2.6
2.6.17
2.8
--trunc----
32bit
alpine
latest
nanoserver
windowsservercore
As far as I know, the CLI does not allow searching/listing tags in a repository.
But if you know which tag you want, you can pull it explicitly by adding a colon and the tag name: docker pull ubuntu:saucy
This script (docker-show-repo-tags.sh) should work for any Docker-enabled host that has curl, awk, sed, grep, and sort. It was updated to reflect the fact that the repository tag URLs changed.
This version correctly parses the "name": field without a JSON parser.
#!/bin/sh
# 2022-07-20
# Simple script that will display Docker repository tags
# using basic tools: curl, awk, sed, grep, and sort.
# Usage:
#   $ docker-show-repo-tags.sh ubuntu centos
#   $ docker-show-repo-tags.sh centos | cat -n
for Repo in "$@" ; do
    URL="https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/"
    curl -sS "$URL" | \
        /usr/bin/sed -Ee 's/("name":)"([^"]*)"/\n\1\2\n/g' | \
        grep '"name":' | \
        awk -F: '{printf("'$Repo':%s\n",$2)}'
done
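To see what the sed/grep/awk stage does without hitting the registry, you can feed it a canned JSON fragment (the tag names below are invented; the `\n` in the sed replacement assumes GNU sed, as the script above does):

```shell
# The same "name" extraction as above, run on a fixed response snippet.
json='{"results":[{"name":"latest"},{"name":"22.04"}]}'
printf '%s' "$json" |
    sed -Ee 's/("name":)"([^"]*)"/\n\1\2\n/g' |
    grep '"name":' |
    awk -F: '{printf("ubuntu:%s\n",$2)}'
# ubuntu:latest
# ubuntu:22.04
```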
This older version no longer works. Many thanks to @d9k for pointing this out!
#!/bin/sh
# WARNING: This no longer works!
# Simple script that will display Docker repository tags
# using basic tools: curl, sed, grep, and sort.
#
# Usage:
#   $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
    curl -sS "https://hub.docker.com/r/library/$Repo/tags/" | \
        sed -e $'s/"tags":/\\\n"tags":/g' -e $'s/\]/\\\n\]/g' | \
        grep '^"tags"' | \
        grep '"library"' | \
        sed -e $'s/,/,\\\n/g' -e 's/,//g' -e 's/"//g' | \
        grep -v 'library:' | \
        sort -fu | \
        sed -e "s/^/${Repo}:/"
done
This older version no longer works. Many thanks to @viky for pointing this out!
#!/bin/sh
# WARNING: This no longer works!
# Simple script that will display Docker repository tags.
#
# Usage:
#   $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
    curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/" | \
        sed -e $'s/,/,\\\n/g' -e $'s/\[/\\\[\n/g' | \
        grep '"name"' | \
        awk -F\" '{print $4;}' | \
        sort -fu | \
        sed -e "s/^/${Repo}:/"
done
This is the output for a simple example:
$ docker-show-repo-tags.sh centos | cat -n
1 centos:5
2 centos:5.11
3 centos:6
4 centos:6.10
5 centos:6.6
6 centos:6.7
7 centos:6.8
8 centos:6.9
9 centos:7.0.1406
10 centos:7.1.1503
11 centos:7.2.1511
12 centos:7.3.1611
13 centos:7.4.1708
14 centos:7.5.1804
15 centos:centos5
16 centos:centos5.11
17 centos:centos6
18 centos:centos6.10
19 centos:centos6.6
20 centos:centos6.7
21 centos:centos6.8
22 centos:centos6.9
23 centos:centos7
24 centos:centos7.0.1406
25 centos:centos7.1.1503
26 centos:centos7.2.1511
27 centos:centos7.3.1611
28 centos:centos7.4.1708
29 centos:centos7.5.1804
30 centos:latest
I wrote a command line tool to simplify searching Docker Hub repository tags, available in my PyTools GitHub repository. It's simple to use with various command line switches, but most basically:
./dockerhub_show_tags.py repo1 repo2
It's even available as a Docker image and can take multiple repositories:
docker run harisekhon/pytools dockerhub_show_tags.py centos ubuntu
DockerHub
repo: centos
tags: 5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
repo: ubuntu
tags: latest
14.04
15.10
16.04
trusty
trusty-20160503.1
wily
wily-20160503
xenial
xenial-20160503
If you want to embed it in scripts, use -q / --quiet to get just the tags, like normal Docker commands:
./dockerhub_show_tags.py centos -q
5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
The v2 API seems to use some kind of pagination, so that it does not return all the available tags. This is clearly visible in projects such as python (or library/python). Even after quickly reading the documentation, I could not manage to work with the API correctly (maybe it is the wrong documentation).
Then I rewrote the script using the v1 API, and it is still using jq:
#!/bin/bash
repo="$1"
if [[ "${repo}" != */* ]]; then
repo="library/${repo}"
fi
url="https://registry.hub.docker.com/v1/repositories/${repo}/tags"
curl -s -S "${url}" | jq '.[]["name"]' | sed 's/^"\(.*\)"$/\1/' | sort
The full script is available at: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.sh
I've also written an improved version (in Python) that aggregates tags that point to the same version: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.py
Add this function to your .zshrc file or run the command manually:
#usage list-dh-tags <repo>
#example: list-dh-tags node
function list-dh-tags(){
wget -q https://registry.hub.docker.com/v1/repositories/$1/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'
}
Thanks to this -> How can I list all tags for a Docker image on a remote registry?
For anyone stumbling across this in modern times, you can use Skopeo to retrieve an image's tags from the Docker registry:
$ skopeo list-tags docker://jenkins/jenkins \
| jq -r '.Tags[] | select(. | contains("lts-alpine"))' \
| sort --version-sort --reverse
lts-alpine
2.277.3-lts-alpine
2.277.2-lts-alpine
2.277.1-lts-alpine
2.263.4-lts-alpine
2.263.3-lts-alpine
2.263.2-lts-alpine
2.263.1-lts-alpine
2.249.3-lts-alpine
2.249.2-lts-alpine
2.249.1-lts-alpine
2.235.5-lts-alpine
2.235.4-lts-alpine
2.235.3-lts-alpine
2.235.2-lts-alpine
2.235.1-lts-alpine
2.222.4-lts-alpine
Reimplementation of the previous post, using Python over sed/AWK:
for Repo in $* ; do
    tags=$(curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/")
    python - <<EOF
import json
tags = [t['name'] for t in json.loads('''$tags''')['results']]
tags.sort()
for tag in tags:
    print "{}:{}".format('$Repo', tag)
EOF
done
For a script that works with OAuth bearer tokens on Docker Hub, try this:
Listing the tags of a Docker image on a Docker hub through the HTTP API
You can use Visual Studio Code to provide autocomplete for available Docker images and tags. However, this requires that you type the first letter of a tag in order to see autocomplete suggestions.
For example, when writing FROM ubuntu it offers autocomplete suggestions like ubuntu, ubuntu-debootstrap and ubuntu-upstart. When writing FROM ubuntu:a it offers autocomplete suggestions, like ubuntu:artful and ubuntu:artful-20170511.1

Parse URL in shell script

I have url like:
sftp://user@host.net/some/random/path
I want to extract user, host and path from this string. Any part can be random length.
[EDIT 2019]
This answer is not meant to be a catch-all, works-for-everything solution; it was intended to provide a simple alternative to the Python-based version, and it ended up having more features than the original.
It answered the basic question in a bash-only way and was then modified multiple times by me to include a handful of requests from commenters. I think at this point, however, adding even more complexity would make it unmaintainable. I know not all things are straightforward (checking for a valid port, for example, requires comparing hostport and host), but I would rather not add even more complexity.
[Original answer]
Assuming your URL is passed as first parameter to the script:
#!/bin/bash
# extract the protocol
proto="$(echo $1 | grep :// | sed -e's,^\(.*://\).*,\1,g')"
# remove the protocol
url="$(echo ${1/$proto/})"
# extract the user (if any)
user="$(echo $url | grep @ | cut -d@ -f1)"
# extract the host and port
hostport="$(echo ${url/$user@/} | cut -d/ -f1)"
# by request host without port
host="$(echo $hostport | sed -e 's,:.*,,g')"
# by request - try to extract the port
port="$(echo $hostport | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
# extract the path (if any)
path="$(echo $url | grep / | cut -d/ -f2-)"
echo "url: $url"
echo " proto: $proto"
echo " user: $user"
echo " host: $host"
echo " port: $port"
echo " path: $path"
I must admit this is not the cleanest solution, but it doesn't rely on another scripting language like Perl or Python.
(Providing a solution using one of them would produce cleaner results. ;))
Using your example the results are:
url: user@host.net/some/random/path
proto: sftp://
user: user
host: host.net
port:
path: some/random/path
This will also work for URLs without a protocol/username or path.
In this case the respective variable will contain an empty string.
[EDIT]
If your bash version can't cope with the substitutions (${1/$proto/}), try this:
#!/bin/bash
# extract the protocol
proto="$(echo $1 | grep :// | sed -e's,^\(.*://\).*,\1,g')"
# remove the protocol -- updated
url=$(echo $1 | sed -e s,$proto,,g)
# extract the user (if any)
user="$(echo $url | grep @ | cut -d@ -f1)"
# extract the host and port -- updated
hostport=$(echo $url | sed -e s,$user@,,g | cut -d/ -f1)
# by request host without port
host="$(echo $hostport | sed -e 's,:.*,,g')"
# by request - try to extract the port
port="$(echo $hostport | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
# extract the path (if any)
path="$(echo $url | grep / | cut -d/ -f2-)"
The above, refined (added password and port parsing), and working in /bin/sh:
# extract the protocol
proto="`echo $DATABASE_URL | grep '://' | sed -e's,^\(.*://\).*,\1,g'`"
# remove the protocol
url=`echo $DATABASE_URL | sed -e s,$proto,,g`
# extract the user and password (if any)
userpass="`echo $url | grep @ | cut -d@ -f1`"
pass=`echo $userpass | grep : | cut -d: -f2`
if [ -n "$pass" ]; then
    user=`echo $userpass | grep : | cut -d: -f1`
else
    user=$userpass
fi
# extract the host -- updated
hostport=`echo $url | sed -e s,$userpass@,,g | cut -d/ -f1`
port=`echo $hostport | grep : | cut -d: -f2`
if [ -n "$port" ]; then
    host=`echo $hostport | grep : | cut -d: -f1`
else
    host=$hostport
fi
# extract the path (if any)
path="`echo $url | grep / | cut -d/ -f2-`"
Posted b/c I needed it, so I wrote it (based on @Shirkin's answer, obviously), and I figured someone else might appreciate it.
This solution works in principle the same as Adam Ryczkowski's, in this thread - but it has an improved regular expression based on RFC 3986 (with some changes) and fixes some errors (e.g. userinfo can contain a '_' character). It can also understand relative URIs (e.g. to extract the query or fragment).
#!/bin/bash
# Following regex is based on https://www.rfc-editor.org/rfc/rfc3986#appendix-B with
# additional sub-expressions to split authority into userinfo, host and port
#
readonly URI_REGEX='^(([^:/?#]+):)?(//((([^:/?#]+)@)?([^:/?#]+)(:([0-9]+))?))?(/([^?#]*))(\?([^#]*))?(#(.*))?$'
#                    ↑↑            ↑  ↑↑↑            ↑          ↑ ↑           ↑ ↑        ↑  ↑        ↑ ↑
#                    |2 scheme     |  ||6 userinfo   7 host     | 9 port      | 11 rpath |  13 query | 15 fragment
#                    1 scheme:     |  |5 userinfo@              8 :…          10 path    12 ?…       14 #…
#                                  |  4 authority
#                                  3 //…
parse_scheme () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[2]}"
}
parse_authority () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[4]}"
}
parse_user () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[6]}"
}
parse_host () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[7]}"
}
parse_port () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[9]}"
}
parse_path () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[10]}"
}
parse_rpath () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[11]}"
}
parse_query () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[13]}"
}
parse_fragment () {
    [[ "$@" =~ $URI_REGEX ]] && echo "${BASH_REMATCH[15]}"
}
Using Python (best tool for this job, IMHO):
#!/usr/bin/env python
import os
from urlparse import urlparse
uri = os.environ['NAUTILUS_SCRIPT_CURRENT_URI']
result = urlparse(uri)
user, host = result.netloc.split('@')
path = result.path
print('user=', user)
print('host=', host)
print('path=', path)
Further reading:
os.environ
urlparse.urlparse()
If you really want to do it in shell, you can do something as simple as the following by using awk. This requires knowing how many fields you will actually be passed (e.g. no password sometimes and not others).
#!/bin/bash
FIELDS=($(echo "sftp://user@host.net/some/random/path" \
    | awk '{split($0, arr, /[\/\@:]*/); for (x in arr) { print arr[x] }}'))
proto=${FIELDS[0]}
user=${FIELDS[1]}
host=${FIELDS[2]}
path=$(echo ${FIELDS[@]:3} | sed 's/ /\//g')
If you don't have awk and you do have grep, and you can require that each field have at least two characters and be reasonably predictable in format, then you can do:
#!/bin/bash
FIELDS=($(echo "sftp://user@host.net/some/random/path" \
    | grep -o "[a-z0-9.-][a-z0-9.-]*" | tr '\n' ' '))
proto=${FIELDS[0]}
user=${FIELDS[1]}
host=${FIELDS[2]}
path=$(echo ${FIELDS[@]:3} | sed 's/ /\//g')
Just needed to do the same, so I was curious if it's possible to do it in a single line, and this is what I've got:
#!/bin/bash
parse_url() {
eval $(echo "$1" | sed -e "s#^\(\(.*\)://\)\?\(\([^:@]*\)\(:\(.*\)\)\?@\)\?\([^/?]*\)\(/\(.*\)\)\?#${PREFIX:-URL_}SCHEME='\2' ${PREFIX:-URL_}USER='\4' ${PREFIX:-URL_}PASSWORD='\6' ${PREFIX:-URL_}HOST='\7' ${PREFIX:-URL_}PATH='\9'#")
}
URL=${1:-"http://user:pass@example.com/path/somewhere"}
PREFIX="URL_" parse_url "$URL"
echo "$URL_SCHEME://$URL_USER:$URL_PASSWORD@$URL_HOST/$URL_PATH"
How it works:
There is that crazy sed regex that captures all the parts of the url, where all of them are optional (except for the host name)
Using those capture groups, sed outputs env variable names with their values for the relevant parts (like URL_SCHEME or URL_USER)
eval executes that output, causing those variables to be set and available in the script
Optionally, a PREFIX can be passed to control the output env variable names
PS: be careful when using this for arbitrary input, since this code is vulnerable to script injection.
Here's my take, loosely based on some of the existing answers, but it can also cope with GitHub SSH clone URLs:
#!/bin/bash
PROJECT_URL="git@github.com:heremaps/here-aaa-java-sdk.git"
# Extract the protocol (includes trailing "://").
PARSED_PROTO="$(echo $PROJECT_URL | sed -nr 's,^(.*://).*,\1,p')"
# Remove the protocol from the URL.
PARSED_URL="$(echo ${PROJECT_URL/$PARSED_PROTO/})"
# Extract the user (includes trailing "@").
PARSED_USER="$(echo $PARSED_URL | sed -nr 's,^(.*@).*,\1,p')"
# Remove the user from the URL.
PARSED_URL="$(echo ${PARSED_URL/$PARSED_USER/})"
# Extract the port (includes leading ":").
PARSED_PORT="$(echo $PARSED_URL | sed -nr 's,.*(:[0-9]+).*,\1,p')"
# Remove the port from the URL.
PARSED_URL="$(echo ${PARSED_URL/$PARSED_PORT/})"
# Extract the path (includes leading "/" or ":").
PARSED_PATH="$(echo $PARSED_URL | sed -nr 's,[^/:]*([/:].*),\1,p')"
# Remove the path from the URL.
PARSED_HOST="$(echo ${PARSED_URL/$PARSED_PATH/})"
echo "proto: $PARSED_PROTO"
echo "user: $PARSED_USER"
echo "host: $PARSED_HOST"
echo "port: $PARSED_PORT"
echo "path: $PARSED_PATH"
which gives
proto:
user: git@
host: github.com
port:
path: :heremaps/here-aaa-java-sdk.git
And for PROJECT_URL="ssh://sschuberth@git.eclipse.org:29418/jgit/jgit" you get
proto: ssh://
user: sschuberth@
host: git.eclipse.org
port: :29418
path: /jgit/jgit
You can use bash string manipulation (parameter expansion). It is easy to learn, so if regular expressions give you trouble, try it. Since the value comes from NAUTILUS_SCRIPT_CURRENT_URI, I guess the URI may contain a port, so I kept that part optional.
#!/bin/bash
#You can also use environment variable $NAUTILUS_SCRIPT_CURRENT_URI
X="sftp://user@host.net/some/random/path"
tmp=${X#*//};usr=${tmp%@*}
tmp=${X#*@};host=${tmp%%/*};[[ ${X#*://} == *":"* ]] && host=${host%:*}
tmp=${X#*//};path=${tmp#*/}
proto=${X%%:*}
[[ ${X#*://} == *":"* ]] && tmp=${X##*:} && port=${tmp%%/*}
echo "Protocol: $proto User: $usr Host: $host Port: $port Path: $path"
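To check the optional-port branch, the same expansions can be run against a hypothetical URL that does include a port:

```shell
X="sftp://user@host.net:2222/some/random/path"
tmp=${X#*//};usr=${tmp%@*}                                                # user
tmp=${X#*@};host=${tmp%%/*};[[ ${X#*://} == *":"* ]] && host=${host%:*}   # host.net
tmp=${X#*//};path=${tmp#*/}                                               # some/random/path
proto=${X%%:*}                                                            # sftp
[[ ${X#*://} == *":"* ]] && tmp=${X##*:} && port=${tmp%%/*}               # 2222
echo "Protocol: $proto User: $usr Host: $host Port: $port Path: $path"
```

Note that the test `[[ ${X#*://} == *":"* ]]` only fires when a colon remains after the scheme is stripped, which is what keeps the port optional.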
I don't have enough reputation to comment, but I made a small modification to @patryk-obara's answer.
RFC3986 § 6.2.3. Scheme-Based Normalization
treats
http://example.com
http://example.com/
as equivalent. But I found that his regex did not match a URL like http://example.com. http://example.com/ (with the trailing slash) does match.
I inserted capture group 11, which changes / to (/|$): it matches either / or the end of the string, so http://example.com now matches too.
readonly URI_REGEX='^(([^:/?#]+):)?(//((([^:/?#]+)@)?([^:/?#]+)(:([0-9]+))?))?((/|$)([^?#]*))(\?([^#]*))?(#(.*))?$'
# ↑↑ ↑ ↑↑↑ ↑ ↑ ↑ ↑↑ ↑ ↑ ↑ ↑ ↑
# || | ||| | | | || | | | | |
# |2 scheme | ||6 userinfo 7 host | 9 port || 12 rpath | 14 query | 16 fragment
# 1 scheme: | |5 userinfo# 8 :... || 13 ?... 15 #...
# | 4 authority |11 / or end-of-string
# 3 //... 10 path
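As a sketch of how the group numbers in the diagram map to BASH_REMATCH indices, here is the regex applied to a made-up URL with bash's =~ operator:

```shell
# Same regex as above; group numbers follow the diagram:
# 2=scheme, 6=userinfo, 7=host, 9=port, 10=path, 14=query, 16=fragment.
uri_regex='^(([^:/?#]+):)?(//((([^:/?#]+)@)?([^:/?#]+)(:([0-9]+))?))?((/|$)([^?#]*))(\?([^#]*))?(#(.*))?$'
uri='http://alice@example.com:8080/a/b?x=1#frag'
if [[ $uri =~ $uri_regex ]]; then
    echo "scheme=${BASH_REMATCH[2]} userinfo=${BASH_REMATCH[6]} host=${BASH_REMATCH[7]} port=${BASH_REMATCH[9]}"
    echo "path=${BASH_REMATCH[10]} query=${BASH_REMATCH[14]} fragment=${BASH_REMATCH[16]}"
fi
```

Groups that did not participate in the match (e.g. the port on a port-less URL) simply come back as empty strings.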
If you have access to Bash >= 3.0 you can do this in pure bash as well, thanks to the regex match operator =~:
pattern='^(([[:alnum:]]+)://)?(([[:alnum:]]+)@)?([^:^@]+)(:([[:digit:]]+))?$'
if [[ "http://us@cos.com:3142" =~ $pattern ]]; then
proto=${BASH_REMATCH[2]}
user=${BASH_REMATCH[4]}
host=${BASH_REMATCH[5]}
port=${BASH_REMATCH[7]}
fi
It should be faster and less resource-hungry than all of the previous examples, because no external process is spawned.
A simplistic approach to get just the domain from the full URL:
echo https://stackoverflow.com/questions/6174220/parse-url-in-shell-script | cut -d/ -f1-3
# OUTPUT>>> https://stackoverflow.com
Get only the path:
echo https://stackoverflow.com/questions/6174220/parse-url-in-shell-script | cut -d/ -f4-
# OUTPUT>>> questions/6174220/parse-url-in-shell-script
Not perfect, as the second command strips the preceding slash so you'll need to prepend it by hand.
An awk-based approach for getting just the path without the domain:
echo https://stackoverflow.com/questions/6174220/parse-url-in-shell-script/59971653 | awk -F"/" '{ for (i=4; i<=NF; i++) printf"/%s", $i }'
# OUTPUT>>> /questions/6174220/parse-url-in-shell-script/59971653
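The same host/path split can also be done with parameter expansion alone, avoiding external processes and keeping the leading slash (a sketch reusing the example URL from above):

```shell
url='https://stackoverflow.com/questions/6174220/parse-url-in-shell-script'
without_scheme=${url#*://}     # stackoverflow.com/questions/6174220/parse-url-in-shell-script
host=${without_scheme%%/*}     # stackoverflow.com
path=/${without_scheme#*/}     # /questions/6174220/parse-url-in-shell-script
echo "$host"
echo "$path"
```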
I did further parsing, expanding the solution given by @Shirkrin:
#!/bin/bash
parse_url() {
local query1 query2 path1 path2
# extract the protocol
proto="$(echo $1 | grep :// | sed -e's,^\(.*://\).*,\1,g')"
if [[ ! -z $proto ]] ; then
# remove the protocol
url="$(echo ${1/$proto/})"
# extract the user (if any)
login="$(echo $url | grep @ | cut -d@ -f1)"
# extract the host
host="$(echo ${url/$login@/} | cut -d/ -f1)"
# by request - try to extract the port
port="$(echo $host | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
# extract the uri (if any)
resource="/$(echo $url | grep / | cut -d/ -f2-)"
else
url=""
login=""
host=""
port=""
resource=$1
fi
# extract the path (if any)
path1="$(echo $resource | grep ? | cut -d? -f1 )"
path2="$(echo $resource | grep \# | cut -d# -f1 )"
path=$path1
if [[ -z $path ]] ; then path=$path2 ; fi
if [[ -z $path ]] ; then path=$resource ; fi
# extract the query (if any)
query1="$(echo $resource | grep ? | cut -d? -f2-)"
query2="$(echo $query1 | grep \# | cut -d\# -f1 )"
query=$query2
if [[ -z $query ]] ; then query=$query1 ; fi
# extract the fragment (if any)
fragment="$(echo $resource | grep \# | cut -d\# -f2 )"
echo "url: $url"
echo " proto: $proto"
echo " login: $login"
echo " host: $host"
echo " port: $port"
echo "resource: $resource"
echo " path: $path"
echo " query: $query"
echo "fragment: $fragment"
echo ""
}
parse_url "http://login:password@example.com:8080/one/more/dir/file.exe?a=sth&b=sth#anchor_fragment"
parse_url "https://example.com/one/more/dir/file.exe#anchor_fragment"
parse_url "http://login:password@example.com:8080/one/more/dir/file.exe#anchor_fragment"
parse_url "ftp://user@example.com:8080/one/more/dir/file.exe?a=sth&b=sth"
parse_url "/one/more/dir/file.exe"
parse_url "file.exe"
parse_url "file.exe#anchor"
I did not like the above methods and wrote my own. It is for an ftp link; just replace ftp with http if you need that.
The first line is a small validation of the link: it should look like ftp://user:pass@host.com/path/to/something.
if ! echo "$url" | grep -q '^[[:blank:]]*ftp://[[:alnum:]]\+:[[:alnum:]]\+@[[:alnum:]\.]\+/.*[[:blank:]]*$'; then return 1; fi
login=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\1|' )
pass=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\2|' )
host=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\3|' )
dir=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\4|' )
My actual goal was to check ftp access by url. Here is the full result:
#!/bin/bash
test_ftp_url() # lftp may hang on some ftp problems, like no connection
{
local url="$1"
if ! echo "$url" | grep -q '^[[:blank:]]*ftp://[[:alnum:]]\+:[[:alnum:]]\+@[[:alnum:]\.]\+/.*[[:blank:]]*$'; then return 1; fi
local login=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\1|' )
local pass=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\2|' )
local host=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\3|' )
local dir=$( echo "$url" | sed 's|[[:blank:]]*ftp://\([^:]\+\):\([^@]\+\)@\([^/]\+\)\(/.*\)[[:blank:]]*|\4|' )
exec 3>&2 2>/dev/null
exec 6<>"/dev/tcp/$host/21" || { exec 2>&3 3>&-; echo 'Bash network support is disabled. Skipping ftp check.'; return 0; }
read <&6
if ! echo "${REPLY//$'\r'}" | grep -q '^220'; then exec 2>&3 3>&- 6>&-; return 3; fi # 220 vsFTPd 3.0.2+ (ext.1) ready...
echo -e "USER $login\r" >&6; read <&6
if ! echo "${REPLY//$'\r'}" | grep -q '^331'; then exec 2>&3 3>&- 6>&-; return 4; fi # 331 Please specify the password.
echo -e "PASS $pass\r" >&6; read <&6
if ! echo "${REPLY//$'\r'}" | grep -q '^230'; then exec 2>&3 3>&- 6>&-; return 5; fi # 230 Login successful.
echo -e "CWD $dir\r" >&6; read <&6
if ! echo "${REPLY//$'\r'}" | grep -q '^250'; then exec 2>&3 3>&- 6>&-; return 6; fi # 250 Directory successfully changed.
echo -e "QUIT\r" >&6
exec 2>&3 3>&- 6>&-
return 0
}
test_ftp_url 'ftp://fz223free:fz223free@ftp.zakupki.gov.ru/out/nsi/nsiProtocol/daily'
echo "$?"
I found Adam Ryczkowski's answer helpful, but the original solution did not handle a /path in the URL, so I enhanced it a little bit.
pattern='^(([[:alnum:]]+):\/\/)?(([[:alnum:]]+)@)?([^:^@\/]+)(:([[:digit:]]+))?(\/?[^:^@]*)$'
url="http://us@cos.com:3142/path"
if [[ "$url" =~ $pattern ]]; then
proto=${BASH_REMATCH[2]}
user=${BASH_REMATCH[4]}
host=${BASH_REMATCH[5]}
port=${BASH_REMATCH[7]}
path=${BASH_REMATCH[8]}
echo "proto: $proto"
echo "user: $user"
echo "host: $host"
echo "port: $port"
echo "path: $path"
else
echo "URL did not match pattern: $url"
fi
The pattern is complex, so please use this site to understand it better: https://regex101.com/
I tested it with a bunch of URLs. However, if there are any issues, please let me know.
If you have access to Node.js:
export MY_URI=sftp://user#host.net/some/random/path
node -e "console.log(require('url').parse(process.env.MY_URI).auth)"
node -e "console.log(require('url').parse(process.env.MY_URI).host)"
node -e "console.log(require('url').parse(process.env.MY_URI).path)"
This will output:
user
host.net
/some/random/path
Here's a pure bash URL parser. It supports git ssh clone style URLs as well as standard proto:// ones. The example ignores protocol, auth, and port, but you can modify it to collect those as needed... I used regex101 for handy testing: https://regex101.com/r/5QyNI5/1
TEST_URLS=(
https://github.com/briceburg/tools.git
https://foo:12333@github.com:8080/briceburg/tools.git
git@github.com:briceburg/tools.git
https://me@gmail.com:12345@my.site.com:443/p/a/t/h
)
for url in "${TEST_URLS[@]}"; do
without_proto="${url#*:\/\/}"
without_auth="${without_proto##*@}"
[[ $without_auth =~ ^([^:\/]+)(:[[:digit:]]+\/|:|\/)?(.*) ]]
PROJECT_HOST="${BASH_REMATCH[1]}"
PROJECT_PATH="${BASH_REMATCH[3]}"
echo "given: $url"
echo " -> host: $PROJECT_HOST path: $PROJECT_PATH"
done
results in:
given: https://github.com/briceburg/tools.git
-> host: github.com path: briceburg/tools.git
given: https://foo:12333@github.com:8080/briceburg/tools.git
-> host: github.com path: briceburg/tools.git
given: git@github.com:briceburg/tools.git
-> host: github.com path: briceburg/tools.git
given: https://me@gmail.com:12345@my.site.com:443/p/a/t/h
-> host: my.site.com path: p/a/t/h
