Replacement string not working in GNU parallel

I have the script run_md.py which produces the file test.dcd from the input file named test.pdb.
I want to execute the same command on multiple input files (test*.pdb) on a remote server using GNU parallel and transfer the result back to the local computer. Therefore, I'm using the following command:
parallel --trc {.}.dcd -j 2 -S $SERVER1 './run_md.py {} 1000' ::: test*.pdb
The command is running as expected on the server using 2 slots. However, the files are not transferred back and I get the following error:
rsync: link_stat "/home/bougui/{.}.dcd" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.1]
It looks like the replacement string is not working. How can I make it work?
Below is the output of parallel --version:
GNU parallel 20130922
Copyright (C) 2007,2008,2009,2010,2011,2012,2013 Ole Tange and Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
GNU parallel comes with no warranty.
Web site: http://www.gnu.org/software/parallel
When using GNU Parallel for a publication please cite:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.

What you are doing is 100% correct. So something on your system is breaking this. Please try this on another system and if possible follow REPORTING BUGS from man parallel.

The bug has since been fixed, and this feature works well with the latest version of GNU parallel (20160622). GNU parallel version 20130922, as packaged with Debian 8.5, mishandles the {.} replacement string, as described below:
With more testing, I found that the output file must also be specified with a replacement string in the command that parallel runs.
For testing purposes, here is a complete example that others can run:
echo This is input_file > input_file && parallel --trc {}.out -S $SERVER1 cat {} ">"{}.out ::: input_file
The example above works well. When I use the replacement string {.} as below:
echo This is input_file > input_file.in && parallel --trc {.}.out -S $SERVER1 cat {} ">"{.}.out ::: input_file.in
It works as well. However, if I do not specify {.}.out in the command run in parallel, as below:
echo This is input_file > input_file.in && parallel --trc {.}.out -S $SERVER1 cat {} ">"input_file.out ::: input_file.in
... I reproduce the error:
rsync: link_stat "/home/bouvier/{.}.out" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.1]
rsync: [Receiver] write error: Broken pipe (32)
Therefore, with this version, the output file must be specified with the same replacement string in the command run in parallel.
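To make the expected expansion concrete, here is a minimal sketch of what {.} should produce for each input. The helper name strip_ext is hypothetical, and the sed expression only approximates the documented behaviour of {.} (the input with its last extension removed); it is not GNU parallel's own code.

```shell
#!/usr/bin/env bash
# strip_ext is a hypothetical helper that approximates {.}:
# it removes the last ".ext" from the final path component only.
strip_ext() {
  printf '%s\n' "$1" | sed 's:\.[^/.]*$::'
}

# Show which file --trc {.}.dcd is expected to fetch for each input.
for input in test1.pdb test2.pdb; do
  echo "$input -> $(strip_ext "$input").dcd"
done
```

If rsync complains about a literal "{.}.dcd", as in the error above, the replacement string was never expanded, which points to the parallel version rather than to the command line.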

Related

How to get GNU Parallel report every file processed?

I would like to keep track of GNU parallel in a simple log file and would like it to emit the name of each file as it starts or ends (either or both are equally fine). It seems --verbose is too verbose for this.
If you make a profile that does the logging:
echo 'echo {} >> my.log;' > ~/.parallel/log
Then you can do this:
parallel -J log seq {} ::: 1 2 3
But since the profile uses {} you need to mention {} explicitly.
THIS DOES NOT WORK:
parallel -J log seq ::: 1 2 3
If you are not looking for --joblog then please explain how your needs differ.
--joblog is covered in 7.7 (p. 59) in GNU Parallel 2018 (paper copy: http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html or download it at: https://doi.org/10.5281/zenodo.1146014).
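If a plain start/end log is all that is needed, another option is a small wrapper function. The following is only a sketch: log_and_run is a hypothetical name, and the log line format is an arbitrary choice, not something GNU Parallel prescribes.

```shell
#!/usr/bin/env bash
# log_and_run is a hypothetical wrapper: it appends a "start" and an "end"
# line for the given command to a log file and preserves the exit status.
log_and_run() {
  local logfile=$1
  shift
  echo "start: $*" >> "$logfile"
  "$@"
  local status=$?
  echo "end: $* (exit $status)" >> "$logfile"
  return "$status"
}

# Standalone demonstration with a throwaway log file.
demo_log=$(mktemp)
log_and_run "$demo_log" echo hello
cat "$demo_log"
rm -f "$demo_log"
```

To use it from GNU Parallel under bash, the function would first have to be exported (export -f log_and_run) and could then be called along the lines of parallel log_and_run my.log seq {} ::: 1 2 3. That invocation is an untested assumption.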

Get Lua running with Torch on Windows 10 (with limited admin rights)

Setting up a deep learning framework [Lua, Torch]:
I need to set up Lua with Torch on Windows 10, together with the ZeroBrane IDE, with limited permission to install software and restricted download rights.
It took me a long time, so I thought I would share a recipe. I would be glad if it helps you.
SETTING UP
(Admin) Download/Install tdm64/gcc/5.1.0-2.exe Compiler
(Admin) Download/Install ZeroBrane (Lua IDE)
Download lua/5.3.4.tar.gz (https://www.lua.org/download.html)
Write the batch file build.cmd:
@echo off
setlocal
:: you may change the following variable's value
:: to suit the downloaded version
set lua_version=5.3.4
set work_dir=%~dp0
:: Removes trailing backslash
:: to enhance readability in the following steps
set work_dir=%work_dir:~0,-1%
set lua_install_dir=%work_dir%\lua
set compiler_bin_dir=%work_dir%\tdm-gcc\bin
set lua_build_dir=%work_dir%\lua-%lua_version%
set path=%compiler_bin_dir%;%path%
cd /D %lua_build_dir%
mingw32-make PLAT=mingw
echo.
echo **** COMPILATION TERMINATED ****
echo.
echo **** BUILDING BINARY DISTRIBUTION ****
echo.
:: create a clean "binary" installation
mkdir %lua_install_dir%
mkdir %lua_install_dir%\doc
mkdir %lua_install_dir%\bin
mkdir %lua_install_dir%\include
copy %lua_build_dir%\doc\*.* %lua_install_dir%\doc\*.*
copy %lua_build_dir%\src\*.exe %lua_install_dir%\bin\*.*
copy %lua_build_dir%\src\*.dll %lua_install_dir%\bin\*.*
copy %lua_build_dir%\src\luaconf.h %lua_install_dir%\include\*.*
copy %lua_build_dir%\src\lua.h %lua_install_dir%\include\*.*
copy %lua_build_dir%\src\lualib.h %lua_install_dir%\include\*.*
copy %lua_build_dir%\src\lauxlib.h %lua_install_dir%\include\*.*
copy %lua_build_dir%\src\lua.hpp %lua_install_dir%\include\*.*
echo.
echo **** BINARY DISTRIBUTION BUILT ****
echo.
%lua_install_dir%\bin\lua.exe -e"print [[Hello!]];print[[Simple Lua test successful!!!]]"
echo.
pause
SETTING UP TORCH UNDER LUA ON WINDOWS
--- Quick and dirty ---
Download and unzip the desired binary build from: https://github.com/hiili/WindowsTorch
Create the file user.lua in C:\Users\Name\.zbstudio with this content:
path.lua = [[C:\app\tools\torch\bin\luajit.exe]]
Move the C:\app\tools\torch\lua folder to C:\app\tools\torch\bin
--- Untested alternatives ---
Not tested, but worth trying: https://github.com/torch/torch7/wiki/Windows#cmder
Possibly the second-best option is to build a virtual environment with Linux.
Note: More information on Torch can be found here:
https://github.com/soumith/cvpr2015/blob/master/cvpr-torch.pdf
GET STARTED WITH LUA AND TORCH
http://torch.ch/docs/tutorials.html
I recommend Torch Video Tutorials to get the basics straight (https://github.com/Atcold/torch-Video-Tutorials)
This is a Torch Cheatsheet for further reading (https://github.com/torch/torch7/wiki/Cheatsheet):
- Newbies
- Installing and Running Torch
- Installing Packages
- Tutorials, Demos by Category
- Loading popular datasets
- List of Packages by Category

Fortify, how to start analysis through command

How can we generate a Fortify report from the command line on Linux?
On the command line, how can we include only certain folders or files in the analysis, and how can we specify where the report is stored?
Please help....
Thanks,
Karthik
1. Step#1 (clean cache)
You need to plan the scan structure before starting:
scanid = 9999 (can be anything you like)
ProjectRoot = /local/proj/9999/
WorkingDirectory = /local/proj/9999/working
(This directory gets huge. Run "rm -rf ./working && mkdir ./working" before every scan; otherwise byte code piles up underneath it and consumes your hard disk fast.)
log = /local/proj/9999/working/sca.log
source='/local/proj/9999/source/src/**.*'
classpath='/local/proj/9999/source/WEB-INF/lib/*.jar:/local/proj/9999/source/jars/**.*:/local/proj/9999/source/classes/**.*'
./sourceanalyzer -b 9999 -Dcom.fortify.sca.ProjectRoot=/local/proj/9999/ -Dcom.fortify.WorkingDirectory=/local/proj/9999/working -logfile /local/proj/9999/working/sca.log -clean
It is important to specify ProjectRoot. If you do not override the system default, the files are put under /home/user/.fortify.
The sca.log location is very important: if Fortify does not find this file, it cannot find the byte code to scan.
If you are the only user, you can set ProjectRoot and WorkingDirectory once and for all in FORTIFY_HOME/Core/config/fortify_sca.properties.
In that case, your command line would be ./sourceanalyzer -b 9999 -clean
2. Step#2 (translate source code to byte code)
nohup ./sourceanalyzer -b 9999 -verbose -64 -Xmx8000M -Xss24M -XX:MaxPermSize=128M -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+UseParallelGC -Dcom.fortify.sca.ProjectRoot=/local/proj/9999/ -Dcom.fortify.WorkingDirectory=/local/proj/9999/working -logfile /local/proj/9999/working/sca.log -source 1.5 -classpath '/local/proj/9999/source/WEB-INF/lib/*.jar:/local/proj/9999/source/jars/**/*.jar:/local/proj/9999/source/classes/**/*.class' -extdirs '/local/proj/9999/source/wars/*.war' '/local/proj/9999/source/src/**/*' &
Always run it as a Unix background job (&), so that it keeps working if your session to the server times out.
-classpath: put every classpath entry you know of here so that Fortify can resolve function calls. If a function cannot be resolved, Fortify skips translating that source code, and that part is not scanned later. The scan quality is then poor even though the FPR looks good (few issues reported), so it is important to have all dependency jars in place.
-extdirs: put all directories/files you don't want to be scanned here.
The last section, the files between ' ', is your source.
-64 selects 64-bit Java. If it is not specified, 32-bit is used and the maximum heap should be below 1.3 GB (-Xmx1200M is safe).
-XX: these options have the same meaning as when launching an application server. Use them only to control the class heap and garbage collection, that is, to tweak performance.
-source is the Java version (1.5 to 1.8).
3. Step#3 (scan with rulepack, custom rules, filters, etc)
nohup ./sourceanalyzer -b 9999 -64 -Xmx8000M -Dcom.fortify.sca.ProjectRoot=/local/proj/9999 -Dcom.fortify.WorkingDirectory=/local/proj/9999/working -logfile /local/proj/9999/working/sca.log -scan -filter '/local/other/filter.txt' -rules '/local/other/custom/*.xml' -f '/local/proj/9999.fpr' &
-filter: the file name must be filter.txt; any rule GUID listed in this file will not be reported.
-rules: the custom rules you wrote. The HP rulepacks are in the FORTIFY_HOME/Core/config/rules directory.
-scan: keyword that tells the Fortify engine to scan the existing scanid. You can skip step#2 and do only step#3 if you did not change the code and just want to try different filters or custom rules.
4. Step#4 Generate PDF from the FPR file (if required)
./ReportGenerator -format pdf -f '/local/proj/9999.pdf' -source '/local/proj/9999.fpr'
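The four steps can be tied together in one script. The sketch below is a dry run: it only prints the commands, trimmed to the options that define the scan layout, so the paths can be checked for consistency before anything is executed. The variable names and the print_scan_plan function are hypothetical; the paths and flags are the ones used in the steps above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands for the four steps without running them.
# All variable names and print_scan_plan are illustrative assumptions.
SCAN_ID=9999
PROJECT_ROOT=/local/proj/$SCAN_ID
WORKING_DIR=$PROJECT_ROOT/working
LOG_FILE=$WORKING_DIR/sca.log
FPR_FILE=/local/proj/$SCAN_ID.fpr
PDF_FILE=/local/proj/$SCAN_ID.pdf

print_scan_plan() {
  # Options shared by every sourceanalyzer call, defined once so the
  # project root, working directory and log file cannot drift apart.
  local common="-b $SCAN_ID -Dcom.fortify.sca.ProjectRoot=$PROJECT_ROOT -Dcom.fortify.WorkingDirectory=$WORKING_DIR -logfile $LOG_FILE"

  echo "./sourceanalyzer $common -clean"
  echo "./sourceanalyzer $common -source 1.5 -classpath '$PROJECT_ROOT/source/WEB-INF/lib/*.jar' '$PROJECT_ROOT/source/src/**/*'"
  echo "./sourceanalyzer $common -scan -f '$FPR_FILE'"
  echo "./ReportGenerator -format pdf -f '$PDF_FILE' -source '$FPR_FILE'"
}

print_scan_plan
```

Keeping the shared options in a single variable avoids the easy mistake of pointing -logfile at a different directory in each step.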

GNU Parallel: suppress warning when input is read from terminal

When input is read from terminal, GNU Parallel always displays a warning:
parallel: Warning: Input is read from the terminal. Only experts do this on purpose. Press CTRL-D to exit.
But sometimes I do want to read from terminal (e.g., when I'm copy & pasting stuff from elsewhere entry by entry). Is it possible to turn off this warning? I couldn't find such an option in man parallel or man parallel_tutorial.
Note that I don't want a cheap solution like 2>/dev/null, since warning messages from other programs will be turned off, too. For instance, consider the following simple script:
#!/bin/bash
function print12 () {
echo "printing $1 to stdout"
echo "printing $1 to stderr" >/dev/stderr
}
export -f print12
SHELL=/bin/bash parallel -k print12 2>/dev/null
Messages printed to stderr will all be suppressed.
Just realized that I can do a cat or some read </dev/tty to achieve my desired effect. But let's just focus on the original question.
It cannot be turned off. But see it as praise: since you are doing it on purpose, you are an expert (at least in the eyes of GNU Parallel).
As it is just a warning, you are free to paste your arguments and have them run: The warning does not stop GNU Parallel from reading your input.
If you really do not like the warning:
cat | parallel ...
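The cat workaround presumably works because the warning is tied to standard input being a terminal; that is an assumption about GNU Parallel's internals. The effect of the pipe itself can be shown with the standard test -t check. stdin_kind is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# stdin_kind is a hypothetical helper: it reports whether standard input
# is attached to a terminal, using the POSIX "test -t" check.
stdin_kind() {
  if [ -t 0 ]; then
    echo "terminal"
  else
    echo "pipe or file"
  fi
}

# Behind a pipe, standard input is no longer a terminal.
echo "some input" | stdin_kind
```

Run directly in an interactive shell, stdin_kind reports a terminal; behind cat | it does not, while whatever is typed or pasted still reaches the program through cat.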

bash gnu parallel argfile syntax

I just discovered GNU parallel and I'm having some trouble running a simple parallel task. I have a simulation running over multiple values and I'd like to split it up to run in parallel using command-line args. From the docs, it seems you can run parallel mycommand :::: myargfile, in which myargfile contains the various arguments you would like to feed your command, in parallel. However, I didn't see any information on how the args should be listed and assumed a myargfile like this would work:
--pmin 0 --pmax 0.1
--pmin 0.1 --pmax 0.2
...
mycommand --pmin 0 --pmax 0.1 executes no problem. But when I run parallel mycommand :::: myargfile I get error: unknown option pmin 0 --pmax 0.1 (caught and decoded courtesy boost program options). parallel echo :::: myargfile correctly prints out the arguments. It's as if they are being wrapped in a string which the program can't read and not fed like they are from a standard bash script.
What's going on? How can I make this work?
Following @DmitriChubarov's link to https://stackoverflow.com/a/6258206/1328439, I discovered that I was lacking the --colsep flag:
parallel --colsep ' ' mycommand :::: myargfile
successfully executes.
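The difference can be reproduced in plain bash without parallel. In this sketch, count_args is a hypothetical helper that prints how many arguments it receives; the quoted call mirrors a whole line arriving as one string, the unquoted call mirrors the line split into separate arguments.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that prints how many arguments it received.
count_args() {
  echo "$#"
}

line='--pmin 0 --pmax 0.1'

# Quoted: the whole line arrives as ONE argument, which is why the
# program reported an unknown option "pmin 0 --pmax 0.1".
count_args "$line"

# Unquoted: the shell splits on whitespace, giving FOUR arguments,
# comparable to what splitting the line into columns achieves.
# shellcheck disable=SC2086
count_args $line
```

This is why parallel echo looked correct: echo prints a single string containing spaces exactly as it prints four separate words.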
After digging through the manual and help pages, I came up with this example. Perhaps it will save someone some time. :)
#!/usr/bin/env bash
COMMANDS=(
"cnn -a mode=flat"
"cnn -a mode=xxx"
"cnn_x -a mode=extreme"
)
parallel --verbose --progress --colsep ' ' scrapy crawl {.} ::: "${COMMANDS[@]}"
