I want to do some comparison between the outputs of perf-stat to that of likwid-perfctr. Is there a way to do that. I tried running two commands, one for perf-stat, and the other for liquid-perfctr.
The commands are:
sudo perf stat -C 2 -e instructions, BR_INST_RETIRED.ALL_BRANCHES,branches,rc004,INST_RETIRED.ANY ./loop
sudo likwid-perfctr -C 2 -g MYLIST1 -f ./loop
The first instruction is related to perf-stat which captures importantly branches, and instructions count redundantly. The second instruction is related to likwid-perfctr which captures similar data. Just to mention I wrote my own group called MYLIST1 for likwid-perfctr.
But when I compare both the results, its turning out to be quite different.
Output Comparison
So, when we look into the output, INSTR_RETIRED_ANY in perf stat are: 15552, to that of likwid-perfctr are: 190594. And branches are: 3168 vs 42744.
I'm not sure what I'm doing wrong. Or is there any way to properly do that.
Related
I am trying to submit a job on high compute cluster that needs to run a python code lets say 10000 times. I used gnu parallel but then IT team sent me a mail stating that my job is creating too many ssh login logs in their monitoring system. They asked me to use job arrays instead. My code takes about 12 seconds to run. I believe I need to use #PBS -J statement in my PBS script. Then, I am not sure if it will run in parallel. I need to execute my code lets say on 10 nodes 16 cores each i.e. 160 instances of my code running in parallel. How can I parallelize it i.e. run many instances of my code at a given time utilizing all the resources I have?
Below is the initial pbs script with gnu parallel:
#!/bin/bash
#PBS -P My_project
#PBS -N my_job
#PBS -l select=10:ncpus=16:mem=4GB
#PBS -l walltime=01:30:00
module load anaconda
module load parallel
cd $PBS_O_WORKDIR
JOBSPERNODE=16
parallel --joblog jobs.log --wd $PBS_O_WORKDIR -j $JOBSPERNODE --sshloginfile $PBS_NODEFILE --env PATH "python $PBS_O_WORKDIR/xyz.py" :::: inputs.txt
inputs.txt is a fie with integer values 0-9999 in each line which is fed to my python code as an argument. Code is highly independent and output of one instance does not affect another.
a little late but thought I'd answer anyway.
Arrays will run in parallel, but the number of jobs running at once will depend on the availability of nodes and the limit of jobs per user per queue. Essentially, each HPC will be slightly different.
Adding #PBS -J 1-10000 will create an array of 10000 jobs, and assuming the syntax is the same as the HPC I use, something like ID=$(sed -n "${PBS_ARRAY_INDEX}p" /path/to/inputs.txt) will then be the integers from inputs.txt whereby PBS array number 123 will return the 123rd line of inputs.txt.
Alternatively, since you're on an HPC, if the jobs are only taking 12 seconds each, and you have 10000 iterations, then a for loop will also complete the entire process in 33.33 hours.
Dipping my toes into Bash coding for the first time (not the most experienced person with Linux either) and I'm trying to read the version from the version.php inside a container at:
/config/www/nextcloud/version.php
To do so, I run:
docker exec -it 1c8c05daba19 grep -eo "(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?" /config/www/nextcloud/version.php
This uses a semantic versioning RegEx pattern (I know, a bit overkill, but it works for now) to read and extract the version from the line:
$OC_VersionString = '20.0.1';
However, when I run the command it tells me No such file or directory, (I've confirmed it does exist at that path inside the container) and then proceeds to spit out the entire contents of the file it just said doesn't exist?
grep: (0|[1-9]\d*).(0|[1-9]\d*).(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-])(?:.(?:0|[1-9]\d|\d*[a-zA-Z-][0-9a-zA-Z-]))))?(?:+([0-9a-zA-Z-]+(?:.[0-9a-zA-Z-]+)*))?: No such file or directory
/config/www/nextcloud/version.php:$OC_Version = array(20,0,1,1);
/config/www/nextcloud/version.php:$OC_VersionString = '20.0.1';
/config/www/nextcloud/version.php:$OC_Edition = '';
/config/www/nextcloud/version.php:$OC_VersionCanBeUpgradedFrom = array (
/config/www/nextcloud/version.php: 'nextcloud' =>
/config/www/nextcloud/version.php: 'owncloud' =>
/config/www/nextcloud/version.php:$vendor = 'nextcloud';
Anyone able to spot the problem?
Update 1:
For the sake of clarity, I'm trying to run this from a bash script. I just want to fetch the version number from that file, to use it in other areas of the script.
Update 2:
Responding to the comments, I tried to login to the container first, and then run the grep, and still get the same result. Then I cat that file and it shows it's contents no problem.
Many containers don't have the GNU versions of Unix tools and their various extensions. It's popular to base containers on Alpine Linux, which in turn uses a very lightweight single-binary tool called BusyBox to provide the base tools. Those tend to have the set of options required in the POSIX specs, and no more.
POSIX grep(1) in particular doesn't have an -o option. So the command you're running is
grep \
-eo \ # specify "o" as the regexp to match
"(regexps are write-only)" \ # a filename
/config/www/nextcloud/version.php # a second filename
Notice that the grep output in the interactive shell only contains lines with the letter "o", but not for example the line just containing array.
POSIX grep doesn't have an equivalent for GNU grep's -o option
Print only the matched (non-empty) parts of matching lines, with each such part on a separate output line. Output lines use the same delimiters as input....
but it's easy to do that with sed(1) instead. Ask it to match some stuff, the regexp in question, and some stuff, and replace it with the matched group.
sed -e 's/.*\(any regexp here\).*/\1/' input-file
(POSIX sed only accepts basic regular expressions, so you'll have to escape more of the parentheses.)
Well, for any potential future readers, I had no luck getting grep to do it, I'm sure it was my fault somehow and not grep's, but thanks to the help in this post I was able to use awk instead of grep, like so:
docker exec -it 1c8c05daba19 awk '/^\$OC_VersionString/ && match($0,/\047[0-9]+\.[0-9]+\.[0-9]+\047/){print substr($0,RSTART+1,RLENGTH-2)}' /config/www/nextcloud/version.php
That ended up doing exactly what I needed:
It logs into a docker container.
Scans and returns just the version number from the line I am looking for at: /config/www/nextcloud/version.php inside the container.
Exits stage left from the container with just the info I needed.
I can get right back to eating my Hot Cheetos.
When I use a specific grep command from my history, it gives me no results. However, what appears to be the same command from a different spot in my history gives me thousands of lines of results. Below is a copy and paste from my command history. 519 gives me nothing while 520 yields many results. Also, just typing in the command gives me many results.
519 grep -r -i “agm/core”
520 grep -r -i "agm/core"
I am using Git Bash on Windows 10. git version 2.16.2.windows.1
The only thing I can think of is that at some point I used --exclude-dir, so if that flag persisted it could mess with one command and not others? However, I saw nothing about that in the man pages.
Quick question: what is the compiler flag to allow g++ to spawn multiple instances of itself in order to compile large projects quicker (for example 4 source files at a time for a multi-core CPU)?
You can do this with make - with gnu make it is the -j flag (this will also help on a uniprocessor machine).
For example if you want 4 parallel jobs from make:
make -j 4
You can also run gcc in a pipe with
gcc -pipe
This will pipeline the compile stages, which will also help keep the cores busy.
If you have additional machines available too, you might check out distcc, which will farm compiles out to those as well.
There is no such flag, and having one runs against the Unix philosophy of having each tool perform just one function and perform it well. Spawning compiler processes is conceptually the job of the build system. What you are probably looking for is the -j (jobs) flag to GNU make, a la
make -j4
Or you can use pmake or similar parallel make systems.
People have mentioned make but bjam also supports a similar concept. Using bjam -jx instructs bjam to build up to x concurrent commands.
We use the same build scripts on Windows and Linux and using this option halves our build times on both platforms. Nice.
If using make, issue with -j. From man make:
-j [jobs], --jobs[=jobs]
Specifies the number of jobs (commands) to run simultaneously.
If there is more than one -j option, the last one is effective.
If the -j option is given without an argument, make will not limit the
number of jobs that can run simultaneously.
And most notably, if you want to script or identify the number of cores you have available (depending on your environment, and if you run in many environments, this can change a lot) you may use ubiquitous Python function cpu_count():
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.cpu_count
Like this:
make -j $(python3 -c 'import multiprocessing as mp; print(int(mp.cpu_count() * 1.5))')
If you're asking why 1.5 I'll quote user artless-noise in a comment above:
The 1.5 number is because of the noted I/O bound problem. It is a rule of thumb. About 1/3 of the jobs will be waiting for I/O, so the remaining jobs will be using the available cores. A number greater than the cores is better and you could even go as high as 2x.
make will do this for you. Investigate the -j and -l switches in the man page. I don't think g++ is parallelizable.
distcc can also be used to distribute compiles not only on the current machine, but also on other machines in a farm that have distcc installed.
I'm not sure about g++, but if you're using GNU Make then "make -j N" (where N is the number of threads make can create) will allow make to run multple g++ jobs at the same time (so long as the files do not depend on each other).
GNU parallel
I was making a synthetic compilation benchmark and couldn't be bothered to write a Makefile, so I used:
sudo apt-get install parallel
ls | grep -E '\.c$' | parallel -t --will-cite "gcc -c -o '{.}.o' '{}'"
Explanation:
{.} takes the input argument and removes its extension
-t prints out the commands being run to give us an idea of progress
--will-cite removes the request to cite the software if you publish results using it...
parallel is so convenient that I could even do a timestamp check myself:
ls | grep -E '\.c$' | parallel -t --will-cite "\
if ! [ -f '{.}.o' ] || [ '{}' -nt '{.}.o' ]; then
gcc -c -o '{.}.o' '{}'
fi
"
xargs -P can also run jobs in parallel, but it is a bit less convenient to do the extension manipulation or run multiple commands with it: Calling multiple commands through xargs
Parallel linking was asked at: Can gcc use multiple cores when linking?
TODO: I think I read somewhere that compilation can be reduced to matrix multiplication, so maybe it is also possible to speed up single file compilation for large files. But I can't find a reference now.
Tested in Ubuntu 18.10.
I stopped a process to troubleshoot something. Now, I would like to start the process where it left off in CentOS 6.4.
This script I stopped runs a perl script in a loop to process all of the files in /dev/shm/split/. These files were split into many parts from a larger file. An example of how they are named are as follows:
file.txt.aa
file.txt.ab
file.txt.ac
...and so on.
I have identified that the script left off at file.txt.fy. So, I would like to remove all of the files in /dev/shm/split/ that are from file.txt.aa through file.txt.fy.
I tried to create a whitelist for the rm command by doing:
ls /dev/shm/split/ > whitelist
cat whitelist | egrep -v 'file.txt.[aa-fz]' | tee whitelist.tmp
This did not do what I had intended.
Please help me! Thank you!
The problem with your command is that you cannot match two characters with the square bracket pattern in bash. You should use something like that instead:
ls file.txt.[a-e]? file.txt.f[a-y]
Basically decompose your range into two ranges, the first will match .aa to .ez, and the second .fa to .fy (included).
Note that I have used the ls command here. I always find it a good idea to first echo or ls the commands/files you're going to run when the operations you do are potentially destructive. When you're sure it produces the right output, go on and use rm instead of ls.