Word splitting in bash with input from a file - parsing

I'm having some trouble getting bash to play nicely with parsing words off the command line. I find it easiest to give an example, so without further ado, here we go.
This is the script test.sh:
#!/bin/bash
echo "inside test with $# arguments"
if [[ $# -eq 0 ]]
then
    data=cat data.txt
    echo ./test $data
    ./test $data
else
    for arg in "$@"
    do
        echo "Arg is \"$arg\""
    done
fi
And here is the file data.txt:
"abc 123" 1 2 3 "how are you"
The desired output of
$ ./test.sh
is
inside test with 0 arguments
./test "abc 123" 1 2 3 "how are you"
inside test with 5 arguments
Arg is "abc 123"
Arg is "1"
Arg is "2"
Arg is "3"
Arg is "how are you"
But instead, I'm getting
inside test with 0 arguments
./test "abc 123" 1 2 3 "how are you"
inside test with 8 arguments
Arg is ""abc"
Arg is "123""
Arg is "1"
Arg is "2"
Arg is "3"
Arg is ""how"
Arg is "are"
Arg is "you""
The really annoying thing is that if I copy the command echoed by the script and run it by hand, I do get the desired output (sans the first two lines, of course).
So in essence, is there any way to get bash to parse words if given input which has been read from a file?

You can use eval for this:
eval ./test "$data"
Be careful, though: only use eval when you can trust the contents of the file. To demonstrate why, add ; pwd at the end of the line in your data file and run your script: the current directory gets printed. Imagine if it were something destructive.
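A minimal sketch of the risk (the directory shown is illustrative):
$ data='"abc 123" 1 2 3; pwd'   # imagine data.txt ended in "; pwd"
$ eval ./test "$data"           # runs ./test "abc 123" 1 2 3, then runs pwd
/home/user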
It might be better if you can choose a delimiter other than a space between fields in your data file. For example, if you use tabs, you could do:
while IFS=$'\t' read -r -a array
do
    for item in "${array[@]}"
    do
        echo "$item"    # or do whatever you need with "$item"
    done
done < data.txt
You wouldn't need to quote fields that contain spaces.
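For example, with a tab-separated data.txt (written with printf here so the tabs are explicit):
$ printf 'abc 123\t1\t2\t3\thow are you\n' > data.txt
Each "$item" in the loop is then a whole field: abc 123, 1, 2, 3, and how are you.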
This is a correction to what I presume was a typo in your question:
data=$(cat data.txt)
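In bash, the same thing can be written without the cat:
data=$(< data.txt)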

No need to call the script twice.
If you find there are no arguments, you can use set to change them to something else, e.g.:
#!/bin/bash
if [ $# -eq 0 ]
then
    echo "inside test with $# arguments"
    eval set -- $(<data.txt)
fi
echo "inside test with $# arguments"
for arg in "$@"
do
    echo "Arg is \"$arg\""
done
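Assuming this is saved as test.sh and data.txt is the same as in the question, a run would look like this:
$ ./test.sh
inside test with 0 arguments
inside test with 5 arguments
Arg is "abc 123"
Arg is "1"
Arg is "2"
Arg is "3"
Arg is "how are you"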

Yes, simply replace
./test $data
with
eval ./test $data

Related

Expand ARG/ENV in CMD dockerfile

I have a Dockerfile and I am taking in a LAMBDA_NAME from a Jenkins pipeline.
I am passing in something like this: source-producer
And I want to call the handler of this function, which is named handler in the code.
This code does not work
ARG LAMBDA_NAME
ENV LAMBDA_HANDLER="${LAMBDA_NAME}.handler"
RUN echo "${LAMBDA_HANDLER}"
CMD [ "${LAMBDA_HANDLER}" ]
The RUN echo step prints back "sourceproducer.handler", which is correct.
The code above produces this error
[ERROR] Runtime.MalformedHandlerName: Bad handler '${LAMBDA_HANDLER}': not enough values to unpack (expected 2, got 1)
But, when this same value is hardcoded, it works fine and executes the lambda function.
ARG LAMBDA_NAME
ENV LAMBDA_HANDLER="${LAMBDA_NAME}.handler"
RUN echo "${LAMBDA_HANDLER}"
CMD [ "sourceproducer.handler" ]
How can I correctly use LAMBDA_HANDLER inside of CMD so that the top code block executes the same as the one below it?
You need to use the shell form of the CMD statement. With the exec form of the statement, as you have now, there is no shell to expand environment variables.
Use
CMD "${LAMBDA_HANDLER}"
instead.
This is equivalent to the following, which you can use instead if you prefer the exec form:
CMD [ "/bin/sh", "-c", "${LAMBDA_HANDLER}" ]
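A hypothetical build command for this setup (the image tag is illustrative):
$ docker build --build-arg LAMBDA_NAME=sourceproducer -t source-producer .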

Are ARG values passed to scripts executed with RUN

If I have ARG values and then RUN a script in my Dockerfile, will that script be able to access those values while it runs during build time?
OK, I checked it and it seems to work in my limited example:
# FROM assumed here so the example is a complete Dockerfile; any base image with a shell works
FROM ubuntu
ARG foo=val1
ARG bar=val2
COPY foo.sh /foo.sh
RUN /foo.sh
And foo.sh:
#!/bin/sh
echo ============
echo hello, world
echo $foo
echo $bar
echo ============
This prints:
---> Running in c6c8d99ba28b
============
hello, world
val1
val2
============
Removing intermediate container c6c8d99ba28b
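The defaults can be overridden at build time with --build-arg (the values here are illustrative):
$ docker build --build-arg foo=other1 --build-arg bar=other2 .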

Access environment variable value in docker ENTRYPOINT (exec) from second parameter (with custom entrypoint script as first parameter)

I want to access the value of one of the environment variables in my Dockerfile and pass it as the first argument to the main script in the Docker ENTRYPOINT.
I came across this SO link, which shows two ways to do it: one with the exec form and one with the shell form.
The exec form worked fine to echo the environment variable with ["sh", "-c", "echo $VARIABLE"], but when I tried my custom entrypoint script, ENTRYPOINT ["/bin/customentrypoint.sh", "$VARIABLE"], it was not able to get the value of the variable; it just treated $VARIABLE as a constant.
So I went with the shell-form approach and called ENTRYPOINT /bin/customentrypoint "$VARIABLE", and it worked fine to get the value of $VARIABLE, but it seems to restrict the number of command-line arguments: $# is 1 even after passing other command-line arguments to docker run. Can someone please tell me if I am doing something wrong, or whether I should tackle this a different way? Thanks in advance.
My Dockerfile looks similar to this:
#!/usr/bin/env bash
...
ENV VARIABLE NO
...
RUN echo "#!/bin/bash" > /bin/customentrypoint.sh
RUN echo "if [ "\"\$1\"" = 'YES' ] ; then ; python ${LOCATION}/main.py" \"\$#\" "; else ; echo Please select -e VARIABLE=YES ; fi" >> /bin/customentrypoint.sh
RUN chmod +x /bin/customentrypoint.sh
RUN ln -s -T /bin/customentrypoint.sh /bin/customentrypoint
WORKDIR ${LOCATION}
ENTRYPOINT /bin/customentrypoint "$VARIABLE" # - works fine but limits no of command line arguments
# ENTRYPOINT ["bin/customentrypoint", "$VARIABLE"] # not able to get value of $VARIABLE instead taking as constant.
command I am using
docker run --rm -v $PWD:/mnt -e VARIABLE=VALUE docker_image:tag entrypoint -d /mnt/tmp -i /mnt/input_file
CMD and ENTRYPOINT arguments are interpreted differently depending on how you write them. If you pass them as a plain string (not inside a JSON array), they are run through a shell; with the exec (array) form there is no shell to expand the variable. See https://docs.docker.com/engine/reference/builder/#cmd.
If you want to keep the array (exec) form, you can try:
ENTRYPOINT ["/bin/sh", "-c", "echo ${VARIABLE}"]
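If you also need the docker run arguments forwarded, here is one sketch that keeps the exec form (the trailing "sh" becomes $0 inside the -c script, and the run arguments land in "$@"):
ENTRYPOINT ["/bin/sh", "-c", "exec /bin/customentrypoint \"$VARIABLE\" \"$@\"", "sh"]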

Difference between CMD echo 'Hello world' and CMD ['echo', 'Hello World'] in a Dockerfile?

I would just like to know whether there is some difference behind the scenes between doing:
CMD echo 'Hello world'
and
CMD ['echo', 'Hello World']
I know that both will print it on the console, but if there is some difference, when should I use each option?
There is not much difference in your simple example, but there will be a difference as soon as you need shell features such as variable substitution, e.g. $HOME.
This is the shell form. It will invoke a command shell:
CMD echo 'Hello world'
This is the exec form. It does not invoke a command shell:
CMD ["/usr/bin/echo", "Hello World"]
The exec form is parsed as a JSON array, which means that you should be using double-quotes around words, not single-quotes, and you should give the full path to an executable. The exec form is the preferred format of CMD.
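A quick way to see the difference: with the shell form below, a shell expands the variable, so it prints something like /root; with the exec form it prints the literal string $HOME.
CMD echo $HOME
CMD ["echo", "$HOME"]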

Shell scripting input redirection oddities

Can anyone explain this behavior?
Running:
#!/bin/sh
echo "hello world" | read var1 var2
echo $var1
echo $var2
results in nothing being output, while:
#!/bin/sh
echo "hello world" > test.file
read var1 var2 < test.file
echo $var1
echo $var2
produces the expected output:
hello
world
Shouldn't the pipe do in one step what the redirection to test.file did in the second example? I tried the same code with both the dash and bash shells and got the same behavior from both of them.
A recent addition to bash is the lastpipe option, which allows the last command in a pipeline to run in the current shell, not a subshell, when job control is deactivated.
#!/bin/bash
set +m # Deactivate job control
shopt -s lastpipe
echo "hello world" | read var1 var2
echo $var1
echo $var2
will indeed output
hello
world
This has already been answered correctly, but the solution has not been stated yet. Use ksh, not bash. Compare:
$ echo 'echo "hello world" | read var1 var2
echo $var1
echo $var2' | bash -s
To:
$ echo 'echo "hello world" | read var1 var2
echo $var1
echo $var2' | ksh -s
hello
world
ksh is a superior programming shell because of little niceties like this. (bash is the better interactive shell, in my opinion.)
#!/bin/sh
echo "hello world" | read var1 var2
echo $var1
echo $var2
produces no output because pipelines run each of their components inside a subshell. Subshells inherit copies of the parent shell's variables, rather than sharing them. Try this:
#!/bin/sh
foo="contents of shell variable foo"
echo $foo
(
    echo $foo
    foo="foo contents modified"
    echo $foo
)
echo $foo
The parentheses define a region of code that gets run in a subshell, and $foo retains its original value after being modified inside them.
Now try this:
#!/bin/sh
foo="contents of shell variable foo"
echo $foo
{
    echo $foo
    foo="foo contents modified"
    echo $foo
}
echo $foo
The braces are purely for grouping, no subshell is created, and the $foo modified inside the braces is the same $foo modified outside them.
Now try this:
#!/bin/sh
echo "hello world" | {
read var1 var2
echo $var1
echo $var2
}
echo $var1
echo $var2
Inside the braces, the read builtin creates $var1 and $var2 properly and you can see that they get echoed. Outside the braces, they don't exist any more. All the code within the braces has been run in a subshell because it's one component of a pipeline.
You can put arbitrary amounts of code between braces, so you can use this piping-into-a-block construction whenever you need to run a block of shell script that parses the output of something else.
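For instance, a minimal sketch that parses the output of wc inside such a block (the filename is illustrative):
wc -l < data.txt | {
    read count
    echo "data.txt has $count lines"
}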
In bash you can also avoid the subshell entirely with process substitution:
read var1 var2 < <(echo "hello world")
The post has been properly answered, but I would like to offer an alternative one-liner that may be of some use.
For assigning space separated values from echo (or stdout for that matter) to shell variables, you could consider using shell arrays:
$ var=( $( echo 'hello world' ) )
$ echo ${var[0]}
hello
$ echo ${var[1]}
world
In this example var is an array and the contents can be accessed using the construct ${var[index]}, where index is the array index (starts with 0).
That way you can have as many parameters as you want assigned to the relevant array index.
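The element count is also available, which is handy for checking that you got the expected number of fields (continuing the example above):
$ echo ${#var[@]}
2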
Alright, I figured it out!
This is a hard bug to catch, but it results from the way pipes are handled by the shell. Every element of a pipeline runs in a separate process. When the read command sets var1 and var2, it sets them in its own subshell, not the parent shell. So when the subshell exits, the values of var1 and var2 are lost. You can, however, try doing
var1=$(echo "Hello")
echo $var1
which returns the expected answer. Unfortunately this only works for single variables; you can't set several at a time. In order to set multiple variables at once you must either read into one variable and chop it up into multiple variables, or use something like this:
set -- $(echo "Hello World")
var1="$1" var2="$2"
echo $var1
echo $var2
While I admit it's not as elegant as using a pipe, it works. Of course, you should keep in mind that read was meant to read from files into variables, so having it read from standard input in a pipeline takes a little extra care.
It's because the pipe version creates a subshell, which reads the variable into its own variable space, which is then destroyed when the subshell exits.
Execute this command
$ echo $$;cat | read a
10637
and then use pstree -p to look at the running processes; you will see an extra shell hanging off of your main shell.
| |-bash(10637)-+-bash(10786)
| | `-cat(10785)
My take on this issue (using Bash):
read var1 var2 <<< "hello world"
echo $var1 $var2
Try:
echo "hello world" | (read var1 var2 ; echo $var1 ; echo $var2 )
The problem, as multiple people have stated, is that var1 and var2 are created in a subshell environment that is destroyed when that subshell exits. The above avoids destroying the subshell until the result has been echo'd. Another solution is:
result=`echo "hello world"`
read var1 var2 <<EOF
$result
EOF
echo $var1
echo $var2
