Argument sweep in gnu parallel - gnu-parallel

I would like to run a parameter sweep of the command line argument of a command. The command is:
mycommand --fixed arg 5 --variable_arg 0
and I would like to vary variable_arg from 0 to 100. How can I do this in a single command using GNU parallel, without generating a separate file with all the individual commands?

Maybe something like this:
parallel mycommand --fixed arg 5 --variable_arg {} ::: {0..100}
If you want the result in myout.0 .. myout.100 you can use one of these:
parallel --results myout.{} mycommand --fixed arg 5 --variable_arg {} ::: {0..100}
parallel mycommand --fixed arg 5 --variable_arg {} '>' myout.{} ::: {0..100}
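One detail worth checking: {0..100} is expanded by bash before parallel starts, so parallel simply receives 101 separate arguments after :::. A minimal sketch that counts them, with seq as an alternative for shells that lack brace expansion (the variable names are only for illustration):

```shell
#!/usr/bin/env bash
# Brace expansion is done by the shell, not by parallel.
values=( {0..100} )
count=${#values[@]}
echo "$count values, first=${values[0]} last=${values[$((count - 1))]}"

# The same list from seq, one value per line, for shells without {a..b}.
seq_count=$(seq 0 100 | wc -l)
echo "seq produces $seq_count values"
```

Since parallel reads arguments from standard input when no ::: is given, piping seq 0 100 into parallel should give the same sweep.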

Related

How to get GNU Parallel report every file processed?

I would like to keep track of GNU parallel in a simple log file and would like it to emit the name of each file as it starts / ends (either or both are equally fine). It seems verbose is too verbose for this.
If you make a profile that does the logging:
echo 'echo {} >> my.log;' > ~/.parallel/log
Then you can do this:
parallel -J log seq {} ::: 1 2 3
But since the profile uses {} you need to mention {} explicitly.
THIS DOES NOT WORK:
parallel -J log seq ::: 1 2 3
If you are not looking for --joblog then please explain how your needs differ.
--joblog is covered in 7.7 (p. 59) in GNU Parallel 2018 (paper copy: http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html or download it at: https://doi.org/10.5281/zenodo.1146014).

How to escape a $-sign in a dockerfile?

I am trying to write a dockerfile in which I add a few java-options to a script called envvars.
To achieve that I want to append a few text-lines to said file like so:
RUN echo "JAVA_OPTS=$JAVA_OPTS -Djavax.net.ssl.trustStore=${CERT_DIR}/${HOSTNAME}_truststore.jks" >> ${BIN_DIR}/envvars
RUN echo "JAVA_OPTS=$JAVA_OPTS -Djavax.net.ssl.trustStorePassword=${PWD_TRUSTSTORE}" >> ${BIN_DIR}/envvars
RUN echo "export JAVA_OPTS" >> ${BIN_DIR}/envvars
The issue here is, that I want the misc. placeholders ${varname} (those with curly braces) to be replaced during execution of the docker build command while the substring '$JAVA_OPTS' (i.e. those without braces) should be echoed and thus added to the envvars file verbatim, i.e. in the end the result in the /usr/local/apache2/bin/envvars file should read:
...
JAVA_OPTS=$JAVA_OPTS -Djavax.net.ssl.trustStore=/usr/local/apache2/cert/myserver_truststore.jks
JAVA_OPTS=$JAVA_OPTS -Djavax.net.ssl.trustStorePassword=my_secret
export JAVA_OPTS
How can one escape a $-sign from variable substitution in dockerfiles?
I found hints to use \$ or $$ but neither worked for me.
In case that matters (which I hope/expect not to): I am building the image using "Docker Desktop" on Windows 10 but I would expect the dockerfile to be agnostic of that.
First, add the parser directive # escape=` as the first line of your Dockerfile, since \ is the default escape character in a Dockerfile. Then you can use \$ to escape the dollar sign in the RUN instruction.
Example:
# escape=`
RUN echo "JAVA_OPTS=\$JAVA_OPTS -Djavax.net.ssl.trustStore=${CERT_DIR}/${HOSTNAME}_truststore.jks" >> ${BIN_DIR}/envvars
That will produce JAVA_OPTS=$JAVA_OPTS in your envvars file.
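To see what the shell itself does with such a line, here is a small sketch outside Docker, assuming the RUN line reaches the shell unchanged. CERT_DIR and HOSTNAME are set to hypothetical stand-in values taken from the expected output in the question:

```shell
# Hypothetical stand-ins for the build-time values in the question.
CERT_DIR=/usr/local/apache2/cert
HOSTNAME=myserver

# \$ stays a literal dollar sign; ${...} is expanded by the shell.
escaped="JAVA_OPTS=\$JAVA_OPTS -Djavax.net.ssl.trustStore=${CERT_DIR}/${HOSTNAME}_truststore.jks"

# Equivalent: single-quote the literal part, double-quote the rest.
mixed='JAVA_OPTS=$JAVA_OPTS'" -Djavax.net.ssl.trustStore=${CERT_DIR}/${HOSTNAME}_truststore.jks"

echo "$escaped"
echo "$mixed"
```

Both forms print the same line, so the single-quoted variant may be an option if backslash handling turns out to be the problem.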

Executing bash script on multiple lines inside multiple files in parallel using GNU parallel

I want to use GNU parallel for the following problem:
I have a few files each with several lines of text. I would like to understand how I can run a script (code.sh) on each line of text of each file and for each file in parallel. I should be able to write out the output of the operation on each input file to an output file with a different extension.
Seems this is a case of multiple parallel commands running parallel over all files and then running parallel for all lines inside each file.
This is what I used:
ls mydata_* |
parallel -j+0 'cat {} | parallel -I ./explore-bash.sh > {.}.out'
I do not know how to do this using GNU parallel. Please help.
Your solution seems reasonable. You just need to remove -I:
ls mydata_* | parallel -j+0 'cat {} | parallel ./explore-bash.sh > {.}.out'
Depending on your setup this may be faster, as it will only run n jobs, whereas the solution above will run n*n jobs in parallel (n = number of cores):
ls mydata_* | parallel -j1 'cat {} | parallel ./explore-bash.sh > {.}.out'
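For checking the output names, a sequential equivalent can be handy. The sketch below assumes input files ending in .txt and uses process_line as a hypothetical stand-in for ./explore-bash.sh; the shell expansion ${f%.*} plays the role of {.} in parallel:

```shell
#!/usr/bin/env bash
# process_line is a hypothetical stand-in for ./explore-bash.sh.
process_line() { printf 'processed: %s\n' "$1"; }

# Small sample inputs in a scratch directory.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/mydata_1.txt"
printf 'c\n'    > "$workdir/mydata_2.txt"

for f in "$workdir"/mydata_*.txt; do
  out="${f%.*}.out"   # strip the extension, like {.}.out in parallel
  while IFS= read -r line; do
    process_line "$line"
  done < "$f" > "$out"
done

cat "$workdir/mydata_1.out"
```

The loop keeps the output lines in input order; the parallel version only does so when -k is given.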

run command taking two arguments with GNU parallel

I have a perl program that takes two arguments: a dictionary file composed of
English words, one per line, and a file with concatenated words, also one per
line, something like this:
lovetoplayguitar
...
...
Normally the program is used like this:
perl ./splitwords.pl words-en.txt bigfile.txt
It prints results to stdout.
I am trying to put it through GNU parallel like this:
time parallel -n 2 -j8 -k perl ./splitwords.pl {1} {2} ::: words-en.txt bigfile.txt > splitted.txt
but it doesn't work that way. I have tried many combinations so far but was unable
to run it using parallel.
EDIT
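Actually this seems to be working, however it is using only one core. Why?
With -n 2 and only two arguments after :::, parallel starts a single job, so only one core is used. This will chop bigfile into 1 MB chunks and run one job per chunk: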
cat bigfile.txt | parallel --pipe --cat -k perl ./splitwords.pl words-en.txt {}
If the perlscript only reads the file then this will be faster:
cat bigfile.txt | parallel --pipe --fifo -k perl ./splitwords.pl words-en.txt {}

bash gnu parallel argfile syntax

I just discovered GNU parallel and I'm having some trouble running a simple parallel task. I have a simulation running over multiple values and I'd like to split it up to run in parallel using command line args. From the docs, it seems you can run parallel mycommand :::: myargfile in which myargfile contains the various arguments you would like to feed your command, in parallel. However, I didn't see any information on how the args should be listed and assumed a myargfile like this would work:
--pmin 0 --pmax 0.1
--pmin 0.1 --pmax 0.2
...
mycommand --pmin 0 --pmax 0.1 executes no problem. But when I run parallel mycommand :::: myargfile I get error: unknown option pmin 0 --pmax 0.1 (caught and decoded courtesy boost program options). parallel echo :::: myargfile correctly prints out the arguments. It's as if they are being wrapped in a string which the program can't read and not fed like they are from a standard bash script.
What's going on? How can I make this work?
Following @DmitriChubarov's link to https://stackoverflow.com/a/6258206/1328439 , I discovered that I was lacking the colsep flag:
parallel --colsep ' ' mycommand :::: myargfile
successfully executes.
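The underlying issue can be reproduced in plain shell. Without --colsep, each line of the argfile reaches the command as one quoted argument. In this sketch, count_args is a hypothetical helper that only reports how many arguments it was given:

```shell
# Count how many arguments a command would receive.
count_args() { echo "$#"; }

line='--pmin 0 --pmax 0.1'   # one line of myargfile

quoted=$(count_args "$line")  # passed as a single string, as without --colsep
split=$(count_args $line)     # unquoted: the shell splits on whitespace
echo "quoted: $quoted argument(s), split: $split argument(s)"
```

With --colsep ' ' parallel splits each line into columns itself, so mycommand again sees four separate arguments.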
After digging through the manual and help pages I came up with this example. Perhaps it will save someone some time. :)
#!/usr/bin/env bash
COMMANDS=(
"cnn -a mode=flat"
"cnn -a mode=xxx"
"cnn_x -a mode=extreme"
)
parallel --verbose --progress --colsep ' ' scrapy crawl {.} ::: "${COMMANDS[@]}"
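The quoted [@] expansion matters here: it hands each array element to parallel as its own argument, which --colsep then splits into columns. A small sketch of the difference from [*], again using a hypothetical count_args helper:

```shell
#!/usr/bin/env bash
# Count how many arguments a command would receive.
count_args() { echo "$#"; }

COMMANDS=(
  "cnn -a mode=flat"
  "cnn -a mode=xxx"
  "cnn_x -a mode=extreme"
)

each=$(count_args "${COMMANDS[@]}")    # one argument per array element
joined=$(count_args "${COMMANDS[*]}")  # all elements joined into one argument
echo "[@] gives $each arguments, [*] gives $joined"
```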
