Basically I'm doing a curl and grepping some stuff. But I want to set the output of this curl to a variable, to then use it in another curl.
E.g.:
curl -u asd:asd 'http://zzz:123/aa/aa.aaa?cmd=ls' | grep -B1 -E '<bbb>[4-7][0-9]{8,}' | grep yyy | tail -n 1 | sed -n -e 's/.*<xxx>\(.*\)<\/xxx>.*/\1/p'
but then I want to set the output to a var and use it:
RUN aaa=$(previous curl) && curl -u asd:asd http://$aaa.com
I tried with ${aaa}, with "$aaa", etc., but it didn't work. Any solutions?
UPDATE:
Something is wrong with the previous curl, because it doesn't return the value; probably the curl isn't being executed at all.
I fear you will not be able to achieve this, because from my understanding the RUN statement is for executing a command. To store a value you'll have to use ENV.
For me the following workaround helped
RUN export aaa=$(curl -u asd:asd http://$aaa.com); echo $aaa;
You can add the downstream commands that will use the variable aaa to the right of the semicolon.
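For example, a minimal sketch (the host, credentials, and grep/sed chain are placeholders taken from the question) that keeps the capture and the second curl in a single RUN, so the variable is still in scope:

RUN aaa=$(curl -s -u asd:asd 'http://zzz:123/aa/aa.aaa?cmd=ls' \
      | grep -B1 -E '<bbb>[4-7][0-9]{8,}' | grep yyy | tail -n 1 \
      | sed -n -e 's/.*<xxx>\(.*\)<\/xxx>.*/\1/p') \
 && echo "extracted: $aaa" \
 && curl -u asd:asd "http://$aaa.com"

Because each RUN starts a fresh shell, a variable set in one RUN is gone by the next one; chaining with && (or ;) inside a single RUN is what lets the second curl see $aaa. The echo in the middle also helps debug the "doesn't return the value" case from the update.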
echo $'one\ntwo\nthree' | grep -F -v $(echo three$'\n'one)
The output should, in theory, be the string two.
I've read that the -F option makes grep treat the pattern as a set of fixed strings, one per line, any of which can match (i.e., they are joined by an 'or').
The only mistake is some missing double quotes:
echo $'one\ntwo\nthree' | grep -F -v "$(echo three$'\n'one)"
Also keep in mind that this will filter out "threesome", "someone", etc., since the fixed strings match anywhere within a line.
(@etan-reisner points out that running set -x before the original and the fixed command can be used to observe the difference the double quotes make here, and, more generally, is a useful way to debug bash commands.)
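For instance, a quick sketch of that set -x technique, using the commands from this question:

set -x
echo $'one\ntwo\nthree' | grep -F -v $(echo three$'\n'one)
# trace shows: + grep -F -v three one   (word splitting: "one" becomes a filename argument)
echo $'one\ntwo\nthree' | grep -F -v "$(echo three$'\n'one)"
# trace shows: + grep -F -v $'three\none'   (a single two-line pattern, as intended)
set +x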
I'm using docker registry v1 and I'm interested in migrating to the newer version, v2. But I need some way to get a list of images present on the registry; for example, with registry v1 I can execute a GET request to http://myregistry:5000/v1/search? and the result is:
{
"num_results": 2,
"query": "",
"results": [
{
"description": "",
"name": "deis/router"
},
{
"description": "",
"name": "deis/database"
}
]
}
But I can't find anything similar in the official documentation for getting a list of images on the registry. Does anybody know a way to do it in the new version, v2?
For the latest (as of 2015-07-31) version of Registry V2, you can get this image from DockerHub:
docker pull distribution/registry:master
List all repositories (effectively images):
curl -X GET https://myregistry:5000/v2/_catalog
> {"repositories":["redis","ubuntu"]}
List all tags for a repository:
curl -X GET https://myregistry:5000/v2/ubuntu/tags/list
> {"name":"ubuntu","tags":["14.04"]}
If the registry needs authentication, you have to specify the username and password in the curl command:
curl -X GET -u <user>:<pass> https://myregistry:5000/v2/_catalog
curl -X GET -u <user>:<pass> https://myregistry:5000/v2/ubuntu/tags/list
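To put the two endpoints together, here is a minimal sketch (registry host and credentials are placeholders; assumes jq is installed) that prints the tag list of every repository:

REGISTRY=https://myregistry:5000
CREDS='user:pass'   # replace with real credentials
for repo in $(curl -s -u "$CREDS" "$REGISTRY/v2/_catalog" | jq -r '.repositories[]'); do
  curl -s -u "$CREDS" "$REGISTRY/v2/$repo/tags/list"
done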
You can search at:
http://<ip/hostname>:<port>/v2/_catalog
Get Catalogs
By default, the registry API returns 100 entries of the catalog.
When you curl the registry API:
curl --cacert domain.crt https://your.registry:5000/v2/_catalog
it is equivalent to:
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?n=100
This is a pagination mechanism.
When there are more than 100 entries, you can proceed in two ways:
First: give a bigger number
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?n=2000
Second: parse the next-link URL
curl --cacert domain.crt https://your.registry:5000/v2/_catalog
A Link element is contained in the response header:
Link: </v2/_catalog?last=pro-octopus-ws&n=100>; rel="next"
The Link element contains the last entry of this request; you can then request the next 'page':
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?last=pro-octopus-ws
If the response header contains a Link element, you can keep doing this in a loop; a sketch follows.
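A rough sketch of that loop (registry host and cert path are placeholders; needs only curl and sed):

REGISTRY=https://your.registry:5000
URL="$REGISTRY/v2/_catalog?n=100"
while [ -n "$URL" ]; do
  # -D - prints the response headers to stdout, -o sends the body to a file
  HEADERS=$(curl -s --cacert domain.crt -D - -o /tmp/page.json "$URL")
  cat /tmp/page.json    # one page of repositories
  # pull the next-page path out of the Link header, if any
  NEXT=$(printf '%s' "$HEADERS" | tr -d '\r' \
    | sed -n 's/^[Ll]ink: <\([^>]*\)>; *rel="next".*/\1/p')
  if [ -n "$NEXT" ]; then URL="$REGISTRY$NEXT"; else URL=""; fi
done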
Get Images
When you get the catalog result, it looks like the following:
{
"repositories": [
"busybox",
"ceph/mds"
]
}
You can then list the tags of every repository:
curl --cacert domain.crt https://your.registry:5000/v2/busybox/tags/list
returns:
{"name":"busybox","tags":["latest"]}
The latest version of Docker Registry, available from https://github.com/docker/distribution, supports the Catalog API (/v2/_catalog). This provides the ability to list the repositories in the registry.
If interested, you can try the Docker image registry CLI I built, which makes it easy to use the search features in the new Docker Registry distribution (https://github.com/vivekjuneja/docker_registry_cli).
This has been driving me crazy, but I finally put all the pieces together. As of 1/25/2015, I've confirmed that it is possible to list the images in the Docker V2 registry (exactly as @jonatan mentioned above).
I would up-vote that answer, if I had the rep for it.
Instead, I'll expand on the answer. Since registry V2 is made with security in mind, I think it's appropriate to include how to set it up with a self-signed cert and run the container with that cert, so that an HTTPS call can be made to it:
This is the script I actually use to start the registry:
sudo docker stop registry
sudo docker rm -v registry
sudo docker run -d \
-p 5001:5001 \
-p 5000:5000 \
--restart=always \
--name registry \
-v /data/registry:/var/lib/registry \
-v /root/certs:/certs \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
-e REGISTRY_HTTP_DEBUG_ADDR=':5001' \
registry:2.2.1
This may be obvious to some, but I always get mixed up with keys and certs. The file that needs to be referenced to make the call @jonatan mentions above** is the domain.crt listed above. (Since I put domain.crt in /root, I made a copy into the user directory where it could be accessed.)
curl --cacert ~/domain.crt https://myregistry:5000/v2/_catalog
> {"repositories":["redis","ubuntu"]}
**The command above has been changed: -X GET didn't actually work when I tried it.
Note: https://myregistry:5000 (as above) must match the domain given to the generated cert.
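For completeness, a sketch of how such a self-signed pair can be generated (the paths match the run script above; the CN value is a placeholder and must match your registry hostname):

mkdir -p /root/certs
openssl req -newkey rsa:4096 -nodes -sha256 \
  -keyout /root/certs/domain.key -x509 -days 365 \
  -out /root/certs/domain.crt -subj '/CN=myregistry'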
We wrote a CLI tool for this purpose: docker-ls. It allows you to browse a Docker registry and supports authentication via token or basic auth.
Here is a nice little one-liner (using jq) to print out a list of repos and associated tags.
If you don't have jq installed, you can use: brew install jq
# This is my URL but you can use any
REPO_URL=10.230.47.94:443
curl -k -s -X GET https://$REPO_URL/v2/_catalog \
| jq '.repositories[]' \
| sort \
| xargs -I _ curl -s -k -X GET https://$REPO_URL/v2/_/tags/list
Install registry:2.1.1 or later (you can check for the latest one here) and use GET /v2/_catalog to get the list.
https://github.com/docker/distribution/blob/master/docs/spec/api.md#listing-repositories
Shell script example that lists all images:
https://gist.github.com/OndrejP/a2386d08e5308b0776c0
I had to do the same here, and the above works except that I had to provide login details, as it was a local Docker repository.
It is as per the above, but with the username/password supplied in the URL:
curl -k -X GET https://yourusername:yourpassword@theregistryURL/v2/_catalog
It comes back as unformatted JSON.
I piped it through the Python formatter for ease of human reading, in case you would like to have it in this format:
curl -k -X GET https://yourusername:yourpassword@theregistryURL/v2/_catalog | python -m json.tool
Here's an example that lists all tags of all images on the registry. It handles a registry configured for HTTP Basic auth too.
THE_REGISTRY=localhost:5000
# Get username:password from docker configuration. You could
# inject these some other way instead if you wanted.
CREDS=$(jq -r ".[\"auths\"][\"$THE_REGISTRY\"][\"auth\"]" ~/.docker/config.json | base64 -d)
curl -s --user $CREDS https://$THE_REGISTRY/v2/_catalog | \
jq -r '.["repositories"][]' | \
xargs -I #REPO# curl -s --user $CREDS https://$THE_REGISTRY/v2/#REPO#/tags/list | \
jq -M '.["name"] + ":" + .["tags"][]'
Explanation:
extract username:password from ~/.docker/config.json
make an HTTPS request to the registry to list all "repositories"
filter the JSON result to a flat list of repository names
for each repository name:
make an HTTPS request to the registry to list all "tags" for that "repository"
filter the stream of result JSON objects, printing "repository":"tag" pairs for each tag found in each repository
Using "/v2/_catalog" and "/tags/list" endpoints you can't really list all the images. If you pushed a few different images and tagged them "latest" you can't really list the old images! You can still pull them if you refer to them using digest "docker pull ubuntu#sha256:ac13c5d2...". So the answer is - there is no way to list images you can only list tags which is not the same
I wrote an easy-to-use command-line tool for listing images in various ways (like listing all images, all tags of those images, and all layers of those tags).
It also allows you to delete unused images in various ways, like deleting only older tags of a single image, or doing so across all images, etc. This is convenient when you are filling your registry from a CI server and want to keep only the latest/stable versions.
It is written in Python and does not require you to download bulky custom registry images.
In case someone gets this far:
Taking what others have already said above, here is a one-liner that puts the answer into a text file as formatted JSON.
curl "http://mydocker.registry.domain/v2/_catalog?n=2000" | jq . - > /tmp/registry.lst
This looks like
{
"repositories": [
"somerepo/somecontiner",
"somerepo_other/someothercontiner",
...
]
}
You might need to change the ?n=xxxx to match how many containers you have.
The next step would be a way to automatically remove old and unused containers.
This thread dates back a long time; the most recent tools one should consider are skopeo and crane.
skopeo supports signing and has many other features, while crane is a bit more minimalistic and I found it easier to integrate with in a simple shell script.
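For example, a quick sketch (myregistry:5000 and the ubuntu repository are placeholders; both tools talk to the same v2 API as the curl calls above, though subcommands may vary by version):

# skopeo: list the tags of one repository (note the docker:// transport prefix)
skopeo list-tags docker://myregistry:5000/ubuntu

# crane: list all repositories, then the tags of one of them
crane catalog myregistry:5000
crane ls myregistry:5000/ubuntu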
Docker registry v2 search functionality is not supported at the time of this writing. See the discussion ongoing since Feb 2015: "propose registry search functionality #206" https://github.com/docker/distribution/issues/206
I wrote a script, view-private-registry, that you can find at https://github.com/BradleyA/Search-docker-registry-v2-script.1.0
It is not pretty but it gets the information needed from the private registry.
Example of output from view-private-registry:
$ view-private-registry
busybox:latest
gcr.io/google_containers/etcd:2.0.9
gcr.io/google_containers/hyperkube:v0.21.2
gcr.io/google_containers/pause:0.8.0
google/cadvisor:latest
jenkins:latest
logstash:latest
mongo:latest
nginx:latest
python:2.7
redis:latest
registry:2.1.1
stackengine/controller:latest
tomcat:7
tomcat:latest
ubuntu:14.04.2
Number of images: 16
Disk space used: 1.7G /mnt/three/docker-registry/registry-data
A bash one-liner to list all images with their tags:
curl --user user:pass https://myregistry.com/v2/_catalog | jq .repositories | sed -n 's/[ ",]//gp' | xargs -L1 -IIMAGE curl -s --user user:pass https://myregistry.com/v2/IMAGE/tags/list | jq '. as $parent | .tags[] | $parent.name + ":" + . '
Two lines to search for something in the image name:
search=my_container_part_name
curl --user user:pass https://registry.medworx.io/v2/_catalog | jq .repositories | sed -n '/'"$search"'/{s/[ ",]//gp;}' | xargs -L1 -IIMAGE curl -s --user user:pass https://registry.medworx.io/v2/IMAGE/tags/list | jq '. as $parent | .tags[] | $parent.name + ":" + . '
Replace user, pass, and myregistry.com accordingly.
It uses curl, sed, xargs, and jq, and is hard to understand... but it does the job. It produces one call per image, plus one for the catalog.
If you can ssh or attach to the docker registry container, just browse the filesystem to look for things you want, like:
kubectl exec -it docker-registry-0 -- /bin/sh
ls /var/lib/registry/docker/registry/v2/repositories
ls /var/lib/registry/docker/registry/v2/repositories/busybox/_manifests/tags/
Since each registry runs as a container, the container ID has an associated log file, ID-json.log; this log file contains vars.name=[image] and vars.reference=[tag]. A script can be used to extract and print these. This is perhaps one method to list images pushed to registry V2-2.0.1.
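A rough sketch of that approach (assumes the default json-file log driver and a container named registry; the exact field format varies between registry versions, so treat the grep pattern as a starting point):

# locate the container's JSON log file via docker inspect
LOG=$(docker inspect --format '{{.LogPath}}' registry)
# pull out image names and tags logged by the registry, then dedupe
sudo grep -oE 'vars\.name=[^ "\\]+|vars\.reference=[^ "\\]+' "$LOG" | sort -u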
If your use case is identifying only SIGNED and TRUSTED images for production, then this method is handy.
It parses a Docker image repo for all SIGNED tags and strips away all the JSON formatting, outputting only clean image tags, which can of course be processed further according to your requirements.
Format of Command:
docker trust inspect imageName | grep "SignedTag" | awk -F'"' '{print $4}'
Examples using the nginx & Bitnami Docker repos:
docker trust inspect nginx | grep "SignedTag" | awk -F'"' '{print $4}'
docker trust inspect bitnami/java | grep "SignedTag" | awk -F'"' '{print $4}'
If there are no signed images, then "No signatures or cannot access imageName" will be returned.
Example of a repo WITHOUT signed images (at the time of this writing), using the WordPress Docker repo:
docker trust inspect wordpress | grep "SignedTag" | awk -F'"' '{print $4}'
If you want a nice web interface to your registry, you can use this registry-browser Docker image. This is useful if you just want to look around your registry and its different repositories and tags.
If the accepted answer here only returns a blank line, it is likely because of the SSL/TLS cert on your registry server. Use the --insecure flag:
curl --insecure https://<registryHostnameOrIP>:5000/v2/_catalog
If I have a document with many links and I want to download one particular picture, named www.website.de/picture/example_2015-06-15.jpeg, how can I write a command that automatically downloads exactly the one I extracted from my document?
My idea was this, but I get a failure message like "wget: URL is missing":
grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document | wget
Use xargs:
grep etc... | xargs wget
It takes its stdin (grep's output) and passes that text as command-line arguments to whatever application you tell it to run.
For example,
echo hello | xargs echo 'from xargs '
produces:
from xargs hello
Using backticks would be the easiest way of doing it:
wget `grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document`
This will do too:
wget "$(grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document)"
I'm looking for a way to pseudo-spider a website. The key is that I don't actually want the content, but rather a simple list of URIs. I can get reasonably close to this idea with Wget using the --spider option, but when piping that output through a grep, I can't seem to find the right magic to make it work:
wget --spider --force-html -r -l1 http://somesite.com | grep 'Saving to:'
The grep filter seems to have absolutely no effect on the wget output. Have I got something wrong, or is there another tool I should try that's more geared towards providing this kind of limited result set?
UPDATE
So I just found out offline that, by default, wget writes to stderr. I missed that in the man pages (in fact, I still haven't found it, if it's in there). Once I redirected stderr to stdout, I got closer to what I need:
wget --spider --force-html -r -l1 http://somesite.com 2>&1 | grep 'Saving to:'
I'd still be interested in other/better means for doing this kind of thing, if any exist.
The absolute last thing I want to do is download and parse all of the content myself (i.e. create my own spider). Once I learned that Wget writes to stderr by default, I was able to redirect it to stdout and filter the output appropriately.
wget --spider --force-html -r -l2 $url 2>&1 \
| grep '^--' | awk '{ print $3 }' \
| grep -v '\.\(css\|js\|png\|gif\|jpg\)$' \
> urls.m3u
This gives me a list of the content resource (resources that aren't images, CSS or JS source files) URIs that are spidered. From there, I can send the URIs off to a third party tool for processing to meet my needs.
The output still needs to be streamlined slightly (it produces duplicates, as shown above), but it's almost there and I haven't had to do any parsing myself.
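One way to streamline it: append sort -u to drop the duplicates (otherwise the same command as above):

wget --spider --force-html -r -l2 $url 2>&1 \
  | grep '^--' | awk '{ print $3 }' \
  | grep -v '\.\(css\|js\|png\|gif\|jpg\)$' \
  | sort -u > urls.m3u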
Create a few regular expressions to extract the addresses from all
<a href="(ADDRESS_IS_HERE)">.
Here is the solution I would use:
wget -q http://example.com -O - | \
tr "\t\r\n'" ' "' | \
grep -i -o '<a[^>]\+href[ ]*=[ \t]*"\(ht\|f\)tps\?:[^"]\+"' | \
sed -e 's/^.*"\([^"]\+\)".*$/\1/g'
This will output all http, https, ftp, and ftps links from a webpage. It will not give you relative URLs, only full URLs.
Explanation regarding the options used in the series of piped commands:
wget -q makes it not have excessive output (quiet mode).
wget -O - makes it so that the downloaded file is echoed to stdout, rather than saved to disk.
tr is the unix character translator, used in this example to translate newlines and tabs to spaces, as well as convert single quotes into double quotes so we can simplify our regular expressions.
grep -i makes the search case-insensitive
grep -o makes it output only the matching portions.
sed is the Stream EDitor unix utility which allows for filtering and transformation operations.
sed -e just lets you feed it an expression.
Running this little script on "http://craigslist.org" yielded quite a long list of links:
http://blog.craigslist.org/
http://24hoursoncraigslist.com/subs/nowplaying.html
http://craigslistfoundation.org/
http://atlanta.craigslist.org/
http://austin.craigslist.org/
http://boston.craigslist.org/
http://chicago.craigslist.org/
http://cleveland.craigslist.org/
...
I've used a tool called xidel:
xidel http://server -e '//a/@href' |
grep -v "http" |
sort -u |
xargs -L1 -I {} xidel http://server/{} -e '//a/@href' |
grep -v "http" | sort -u
A little hackish, but it gets you closer! This is only the first level. Imagine packing this up into a self-recursive script, as sketched below!
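To sketch that recursion (a depth-limited function; http://server is still a placeholder and xidel must be on PATH):

#!/bin/bash
SERVER=http://server
crawl() {
  local path=$1 depth=$2
  [ "$depth" -le 0 ] && return
  # same extraction as above: relative hrefs only, deduped
  xidel "$SERVER/$path" -e '//a/@href' 2>/dev/null \
    | grep -v "http" | sort -u \
    | while read -r link; do
        echo "$link"
        crawl "$link" $((depth - 1))   # recurse one level deeper
      done
}
crawl "" 2 | sort -u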