Set a curl output to a variable in a Dockerfile - docker

basically i'm doing a curl and grepping some stuff.
But, i want set the output of this curl to a variable, to then use it on another curl.
e.g:
curl -u asd:asd http://zzz:123/aa/aa.aaa?cmd=ls | grep -B1 -E '<bbb>[4-7]\d{8,}' | grep yyy | tail -n 1 | sed -n -e 's/.*<xxx>\(.*\)<\/xxx>.*/\1/p')
but then I want set the output to a var and use it:
RUN aaa=$(previous curl) && curl -u asd:asd http://$aaa.com
tried with ${aaa}, with "$aaa", etc... didn't work. any solutions?
UPDATE:
something wrong is happening in previous curl 'cause doesn't return the value. probably for not doing the curl

I fear you will not be able to acheive this, because from my understanding RUN statement is to execute a command. To store value you'll have use SET.
For me the following workaround helped
RUN export aaa=$(curl -u asd:asd http://$aaa.com);echo aaa;
You can add the downsteam commands that will use the variable aaa towards right of the semicolon

Related

JSONing a Grepped Environment (jq + grep + env)

I want to get only AWS_ environments variables from the environment and output them into a JSON.
Usually, doing env | grep AWS_ shows me the right env vars, and jq -n env would show the whole env as JSON.
I've tried:
jq -n $(env | grep AWS_)
and
jq -n $(env $(grep AWS_))
both without success.
There are many options, but since (as you point out) jq has an env filter that produces the JSON for you, it would make sense to use it without having to invoke grep and then parse the output to convert it to JSON. For example:
jq -n 'env | with_entries(select(.key | test("AWS_")))'
You might want to change the test to use "^AWS_".

How do I grep for a pattern in which the pattern is a shell-expansion that generates a list?

echo $'one\ntwo\nthree' | grep -F -v $(echo three$'\n'one)
Output should in theory be the string two
I've read that the -F command lets grep interpret each line as a list connected by 'or' qualifier.
Only mistake is some missing double-quotes:
echo $'one\ntwo\nthree' | grep -F -v "$(echo three$'\n'one)"
Also, keep in mind that this will also filter out "threesome", "someone", etc...
(#etan-reisner points out that running set -x before the original and the fixed command can be used to observe the difference the double-quotes make here, and, more generally, is a useful way to debug bash commands.)

How to pass a URL to Wget

If I have a document with many links and I want to download especially one picture with the name www.website.de/picture/example_2015-06-15.jpeg, how can I write a command that downloads me automatically exactly this one I extracted out of my document?
My idea would be this, but I'll get a failure message like "wget: URL is missing":
grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document | wget
Use xargs:
grep etc... | xargs wget
It takes its stdin (grep's output), and passes that text as command line arguments to whatever application you tell it to.
For example,
echo hello | xargs echo 'from xargs '
produces:
from xargs hello
Using back ticks would be the easiest way of doing it:
wget `grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document`
This will do too:
wget "$(grep -E 'www.website.de/picture/example_2015-06-15.jpeg' document)"

DELETE using CURL with encoded URL

I’m trying to make a request using CURL like this:
curl -X DELETE "https://myhost/context/path/users/OXYugGKg207g5uN/07V"
where OXYugGKg207g5uN/07V is a hash, so I suppose that I need to encode before do this request.
I have tried curl -X DELETE --data-urlenconded "https://myhost/context/path/users/OXYugGKg207g5uN/07V"
Some ideas?
Since you are in a bash environment, you could encode the hash OXYugGKg207g5uN/07V before passing it to curl.
A straight-forward approach would be to use its byte representation %4f%58%59%75%67%47%4b%67%32%30%37%67%35%75%4e%2f%30%37%56
To get that, call:
echo -ne "OXYugGKg207g5uN/07V" | xxd -plain | tr -d '\n' | sed 's/\(..\)/%\1/g'
It will give you: %4f%58%59%75%67%47%4b%67%32%30%37%67%35%75%4e%2f%30%37%56
The complete one-liner including curl in bash/zsh/sh/… would look like this:
curl -X DELETE "https://myhost/context/path/users/$(echo -ne "OXYugGKg207g5uN/07V" | xxd -plain | tr -d '\n' | sed 's/\(..\)/%\1/g')"
and is equivalent to
curl -X DELETE "https://myhost/context/path/users/%4f%58%59%75%67%47%4b%67%32%30%37%67%35%75%4e%2f%30%37%56"
This solution is not very pretty, but it works.
I hope you find this helpful.
If really OXYugGKg207g5uN/07V is the hash then you need to encode that, not the whole url. You can use an encoding function available inside the environment you use cURL in.

Spider a Website and Return URLs Only

I'm looking for a way to pseudo-spider a website. The key is that I don't actually want the content, but rather a simple list of URIs. I can get reasonably close to this idea with Wget using the --spider option, but when piping that output through a grep, I can't seem to find the right magic to make it work:
wget --spider --force-html -r -l1 http://somesite.com | grep 'Saving to:'
The grep filter seems to have absolutely no affect on the wget output. Have I got something wrong or is there another tool I should try that's more geared towards providing this kind of limited result set?
UPDATE
So I just found out offline that, by default, wget writes to stderr. I missed that in the man pages (in fact, I still haven't found it if it's in there). Once I piped the return to stdout, I got closer to what I need:
wget --spider --force-html -r -l1 http://somesite.com 2>&1 | grep 'Saving to:'
I'd still be interested in other/better means for doing this kind of thing, if any exist.
The absolute last thing I want to do is download and parse all of the content myself (i.e. create my own spider). Once I learned that Wget writes to stderr by default, I was able to redirect it to stdout and filter the output appropriately.
wget --spider --force-html -r -l2 $url 2>&1 \
| grep '^--' | awk '{ print $3 }' \
| grep -v '\.\(css\|js\|png\|gif\|jpg\)$' \
> urls.m3u
This gives me a list of the content resource (resources that aren't images, CSS or JS source files) URIs that are spidered. From there, I can send the URIs off to a third party tool for processing to meet my needs.
The output still needs to be streamlined slightly (it produces duplicates as it's shown above), but it's almost there and I haven't had to do any parsing myself.
Create a few regular expressions to extract the addresses from all
<a href="(ADDRESS_IS_HERE)">.
Here is the solution I would use:
wget -q http://example.com -O - | \
tr "\t\r\n'" ' "' | \
grep -i -o '<a[^>]\+href[ ]*=[ \t]*"\(ht\|f\)tps\?:[^"]\+"' | \
sed -e 's/^.*"\([^"]\+\)".*$/\1/g'
This will output all http, https, ftp, and ftps links from a webpage. It will not give you relative urls, only full urls.
Explanation regarding the options used in the series of piped commands:
wget -q makes it not have excessive output (quiet mode).
wget -O - makes it so that the downloaded file is echoed to stdout, rather than saved to disk.
tr is the unix character translator, used in this example to translate newlines and tabs to spaces, as well as convert single quotes into double quotes so we can simplify our regular expressions.
grep -i makes the search case-insensitive
grep -o makes it output only the matching portions.
sed is the Stream EDitor unix utility which allows for filtering and transformation operations.
sed -e just lets you feed it an expression.
Running this little script on "http://craigslist.org" yielded quite a long list of links:
http://blog.craigslist.org/
http://24hoursoncraigslist.com/subs/nowplaying.html
http://craigslistfoundation.org/
http://atlanta.craigslist.org/
http://austin.craigslist.org/
http://boston.craigslist.org/
http://chicago.craigslist.org/
http://cleveland.craigslist.org/
...
I've used a tool called xidel
xidel http://server -e '//a/#href' |
grep -v "http" |
sort -u |
xargs -L1 -I {} xidel http://server/{} -e '//a/#href' |
grep -v "http" | sort -u
A little hackish but gets you closer! This is only the first level. Imagine packing this up into a self recursive script!

Resources