GNU Parallel: thread id

In GNU Parallel, the -j option makes it possible to specify the number of concurrent jobs.
Is it possible to get an id of the thread running the job? By thread id I mean a number from
1 to 12 on my machine with 12 threads. As of now I use the following workaround:
doit() {
  let var=$1*12+$2
  echo $var $2
}
export -f doit
for ((i=0;i<2;++i))
do
  parallel -j12 doit ::: $i ::: {1..12}
done
This has the problem that every iteration of the loop waits for all 12 threads to finish.
I am only interested in not running iterations with the same thread id concurrently.
My motivation is that every thread takes a write lock on one of 12 files. I have exactly 12 files, and if the thread working on one file finishes, the next thread could immediately use that file again.

As @MarkSetchell writes, you should use the replacement string {%}, which gives the jobslot number:
parallel --line-buffer -j12 'echo starting job {#} on {%}; sleep {=$_=rand()*30=}; echo finishing job {#} on {%}' ::: {1..50}
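Since the motivation is one write lock per jobslot, {%} can index the file directly. A minimal sketch (the data/lock file names are hypothetical, and flock from util-linux is assumed) of how each slot serializes on its own file:

```shell
# Each jobslot writes through its own lock file, so two jobs on the
# same slot are serialized while different slots run concurrently.
doit() {
  slot=$1; arg=$2
  (
    flock -x 9                     # exclusive lock for this slot's file
    echo "job $arg writing data_$slot"
  ) 9>"lock_$slot"                 # hypothetical per-slot lock file
}
export -f doit
# Intended invocation: GNU Parallel substitutes the jobslot for {%}:
#   parallel -j12 doit {%} {} ::: {1..24}
```

The lock is released as soon as the subshell exits, so the next job assigned to the same slot can take over the file immediately.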

Related

Limit total number of jobs within nested or concurrent GNU Parallel invocations

This is a continuation of this question and this question on nested GNU Parallel. Ultimately, what I want to achieve is to leave my Makefile untouched except for changing the SHELL= variable, and to distribute jobs using parallel across all my machines.
Is there a way to ensure that concurrent executions of GNU Parallel respect the --jobs clause specified in the outer invocation? Or some other way to get a limit on the total number of jobs across parallel invocations? For example: I'd like the inner slot in the output below to always be 1, i.e., the slot 1-2 on line three of the output violates the condition.
~• inner_par="parallel -I // --slotreplace '/%/' --seqreplace '/#/'"
~• cmd='echo id {#}-/#/, slot {%}-/%/, arg {}-//'
~• seq 2 | parallel -j 1 "seq {} | $inner_par $cmd"
id 1-1, slot 1-1, arg 1-1
id 2-1, slot 1-1, arg 2-1
id 2-2, slot 1-2, arg 2-2
~•
Are you looking for sem?
parallel -j 10 parallel -j 20 sem -j 30 --id myid mycmd
This will start 200 sems (10 × 20), but only run 30 mycmds in parallel.
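The point is that every sem sharing --id myid queues on the same counting semaphore, no matter which shell or parallel invocation started it. A small self-contained sketch of that behaviour (sem ships with GNU Parallel; the id demo is arbitrary):

```shell
# All sem calls with the same --id count against one shared limit,
# regardless of which shell or parallel invocation started them.
for i in 1 2 3 4 5 6; do
  sem -j 2 --id demo "echo task $i"   # blocks while 2 tasks are active
done
sem --wait --id demo                  # drain the queue before continuing
```
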

How to get GNU Parallel report every file processed?

I would like to keep track of GNU Parallel in a simple log file and would like it to emit the name of each job as it starts / ends (either or both are equally fine). It seems --verbose is too verbose for this.
If you make a profile that does the logging:
echo 'echo {} >> my.log;' > ~/.parallel/log
Then you can do this:
parallel -J log seq {} ::: 1 2 3
But since the profile uses {} you need to mention {} explicitly.
THIS DOES NOT WORK:
parallel -J log seq ::: 1 2 3
If you are not looking for --joblog then please explain how your needs differ.
--joblog is covered in 7.7 (p. 59) in GNU Parallel 2018 (paper copy: http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html or download it at: https://doi.org/10.5281/zenodo.1146014).
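For reference, a --joblog file is tab-separated with one header line; JobRuntime is in seconds and Starttime is epoch-based, so it records which command ran, when, and whether it succeeded (second line below is illustrative):

```
Seq	Host	Starttime	JobRuntime	Send	Receive	Exitval	Signal	Command
1	:	1587479363.973	0.003	0	2	0	0	seq 1
```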

Waiting for dependencies when running in parallel

When running doit with the -n option, a task doesn't seem to wait for its file_dep to be created by other sub-tasks. Here is a simple example that shows the issue:
def task_pexample():
    yield {
        "name": "test1",
        "actions": ["sleep 5", "touch tmp.txt"],
        "targets": ["tmp.txt"],
    }
    yield {
        "name": "test2",
        "file_dep": ["tmp.txt"],
        "targets": ["tmp2.txt"],
        "actions": ["cp tmp.txt tmp2.txt"],
    }
doit -s pexample runs without a problem. However, when running doit -n 2 -s pexample, the second subtask starts right away, without waiting for the first one to complete. This then generates an error, since the file doesn't exist.
My question is whether doit does or doesn't look for such dependencies in subtask when running in parallel?
Yes. doit looks for dependencies when executing in parallel.
The problem with your example is that you are using the -s/--single command-line option. -s instructs doit to execute the specified task while ignoring its dependencies.
-s is meant for special occasions while developing your task's code, not for standard execution.
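For the example above, dropping -s is the whole fix (sketch; assumes pydoit is installed and pexample is the task file from the question):

```shell
# Without -s, doit resolves file_dep before scheduling, so test2 does
# not start until test1 has produced tmp.txt, even with 2 workers:
doit -n 2 -s pexample   # broken: -s skips the dependency graph
doit -n 2 pexample      # works: test2 waits for tmp.txt
```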

Vivado Synthesis hangs in Docker container spawned by Jenkins

I'm attempting to move our large FPGA build into a Jenkins CI environment, but the build hangs at the end of synthesis when run in a Docker container spawned by Jenkins.
I've attempted to replicate the environment that Jenkins is creating, but when I spawn a Docker container myself, there's no issue with the build.
I've tried:
- reducing the number of jobs (aka threads) that Vivado uses, thinking that perhaps there was some thread collision occurring when writing out log files
- on the same note, using the -nolog -nojournal options on the vivado commands to remove any log file collisions
- taking control of the cloned/checked-out project and running commands as the local user in the Docker container
I also have an extremely small build that makes it through the entire build process in Jenkins with no issue, so I don't think there is a fundamental flaw with my Docker containers.
agent {
    docker {
        image "vivado:2017.4"
        args """
        -v <MOUNT XILINX LICENSE FILE>
        --dns <DNS_ADDRESS>
        --mac-address <MAC_ADDRESS>
        """
    }
}
steps {
    sh "chmod -R 777 ."
    dir(path: "${params.root_dir}") {
        timeout(time: 15, unit: 'MINUTES') {
            // Create HLS IP for use in Vivado project
            sh './run_hls.sh'
        }
        timeout(time: 20, unit: 'MINUTES') {
            // Create vivado project, add sources, constraints, HLS IP, generated IP
            sh 'source source_vivado.sh && vivado -mode batch -source tcl/setup_proj.tcl'
        }
        timeout(time: 20, unit: 'MINUTES') {
            // Create block designs from TCL scripts
            sh 'source source_vivado.sh && vivado -mode batch -source tcl/run_bd.tcl'
        }
        timeout(time: 1, unit: 'HOURS') {
            // Synthesize complete project
            sh 'source source_vivado.sh && vivado -mode batch -source tcl/run_synth.tcl'
        }
    }
}
The run below used 1 job with a 12-hour timeout. You can see that synthesis finished, then a timeout occurred 8 hours later.
[2019-04-17T00:30:06.131Z] Finished Writing Synthesis Report : Time (s): cpu = 00:01:53 ; elapsed = 00:03:03 . Memory (MB): peak = 3288.852 ; gain = 1750.379 ; free physical = 332 ; free virtual = 28594
[2019-04-17T00:30:06.131Z] ---------------------------------------------------------------------------------
[2019-04-17T00:30:06.131Z] Synthesis finished with 0 errors, 0 critical warnings and 671 warnings.
[2019-04-17T08:38:37.742Z] Sending interrupt signal to process
[2019-04-17T08:38:43.013Z] Terminated
[2019-04-17T08:38:43.013Z]
[2019-04-17T08:38:43.013Z] Session terminated, killing shell... ...killed.
[2019-04-17T08:38:43.013Z] script returned exit code 143
Running the same commands in locally spawned Docker containers has no issues whatsoever. Unfortunately, the Jenkins timeout step doesn't appear to flush open buffers, as my post-unsuccessful step that prints out all log files doesn't find synth_1, though I wouldn't expect anything different from the Jenkins capture.
Are there any known issues with Jenkins/Vivado integration? Is there a way to enter a Jenkins spawned container so I can try and duplicate what I'm expecting vs what I'm experiencing?
EDIT: I've since added in a timeout in the actual tcl scripts to move past the wait_on_runs command used in run_synth.tcl, but now I'm experiencing the same hanging behavior during implementation.
The problem lies in the way vivado deals (or doesn't deal...) with its forked processes. Specifically, I think this applies to the parallel synthesis, which may be why you only see it in some of your projects. In the state you describe above (stuck after "Synthesis finished") I noticed a couple of abandoned zombie processes of vivado. To my understanding these are child processes which have exited but whose parent has not collected their exit status. Tracing with strace even reveals vivado polling these processes every five seconds (kill with SIG_0 sends no signal; it only checks that the pid still exists):
restart_syscall(<... resuming interrupted nanosleep ...>) = 0
kill(319, SIG_0) = 0
kill(370, SIG_0) = 0
kill(422, SIG_0) = 0
kill(474, SIG_0) = 0
nanosleep({tv_sec=5, tv_nsec=0}, 0x7f86edcf4dd0) = 0
kill(319, SIG_0) = 0
kill(370, SIG_0) = 0
kill(422, SIG_0) = 0
kill(474, SIG_0) = 0
nanosleep({tv_sec=5, tv_nsec=0}, <detached ...>
But (as we all know) you can't kill zombies, they are already dead...
Normally these processes would be adopted by the init process and reaped there. But in the case of Jenkins Pipeline in Docker there is no init by default. The pipeline spawns the container and runs cat with no input to keep it alive. This way cat becomes PID 1 and takes over the abandoned children of vivado. cat of course doesn't know what to do with them and ignores them (a tragedy, really).
cat,1
|-(sh,16)
|-sh,30 -c ...
| |-sh,31 -c ...
| | `-sleep,5913 3
| `-sh,32 -xe /home/user/.jenkins/workspace...
| `-sh,35 -xe /home/user/.jenkins/workspace...
| `-vivado,36 /opt/Xilinx/Vivado/2019.2/bin/vivado -mode tcl ...
| `-loader,60 /opt/Xilinx/Vivado/2019.2/bin/loader -exec vivado -mode tcl ...
| `-vivado,82 -mode tcl ...
| |-{vivado},84
| |-{vivado},85
| |-{vivado},111
| |-{vivado},118
| `-{vivado},564
|-(vivado,319)
|-(vivado,370)
|-(vivado,422)
`-(vivado,474)
Luckily there is a way to have an init process in the Docker container: passing the --init argument to docker run solves the problem for me.
agent {
    docker {
        image 'vivado:2019.2'
        args '--init'
    }
}
This creates the init process vivado seems to rely on, and the build runs without problems.
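To answer the "duplicate what Jenkins does" part: the same flag works for a container started by hand (sketch; image tag taken from the question, requires a local Docker daemon):

```shell
# --init makes Docker run a minimal init (tini) as PID 1; it adopts and
# reaps vivado's orphaned children instead of ignoring them like cat does.
docker run --init -it --rm vivado:2019.2 bash
```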
Hope this helps you!
Cheers!

How can I stop gnu parallel jobs when any one of them terminates?

Suppose I am running N jobs with the following gnu parallel command:
seq $N | parallel -j 0 --progress ./job.sh
How can I invoke parallel to kill all running jobs and accept no more as soon as any one of them exits?
You can use --halt:
seq $N | parallel -j 0 --halt 2 './job.sh; exit 1'
A small problem with that solution is that you cannot tell whether job.sh failed.
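In newer GNU Parallel versions --halt takes a when,why=val form, which avoids the forced exit 1 and preserves job.sh's own exit status (sketch, reusing the question's $N and ./job.sh):

```shell
# Kill all running jobs and accept no more as soon as any job finishes:
seq "$N" | parallel -j 0 --halt now,done=1 ./job.sh
# Variants: --halt now,fail=1 (first failure) / now,success=1 (first success).
```
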
You may also use killall perl. It is not a precise approach (it kills every Perl process, not just parallel), but it is easy to remember.
