How to split/filter events from perf.data? - perf

Question: Is there a way to extract the samples for a specific event from a perf.data file that contains multiple events?
Context
I have a recorded sample of two events which I took by running something like
perf record -e cycles,instructions sleep 30
As far as I can see, other perf commands such as perf diff and perf script don't have an events filter flag. So it would be useful to split my perf.data into cycles.perf.data and instructions.perf.data.
Thanks!

Related

Snakemake limit the memory usage of jobs

I need to run 20 genomes through Snakemake. The workflow uses basic steps such as alignment, markduplicates, realignment, basecall recalibration and so on. The machine I am using has up to 40 virtual cores and 70G of memory, and I run the program like this:
snakemake -s Snakefile -j 40
This works fine, but as soon as it runs markduplicates alongside other programs, it stops. I think it exhausts the 70G of available memory and crashes.
Is there a way to set a total memory limit of 60G in Snakemake for all running programs? I would like Snakemake to run fewer jobs in order to stay under 60G if some of the steps require a lot of memory. The command line below crashed as well and used more memory than allocated.
snakemake -s Snakefile -j 40 --resources mem_mb=60000
It's not enough to specify --resources mem_mb=60000 on the command line; you also need to specify mem_mb for the rules you want to keep in check. E.g.:
rule markdups:
    input: ...
    output: ...
    resources:
        mem_mb=20000
    shell: ...
rule sort:
    input: ...
    output: ...
    resources:
        mem_mb=1000
    shell: ...
This will submit jobs in such a way that you don't exceed a total of 60GB at any one time. E.g. as far as memory is concerned, this will keep running at most 3 markdups jobs, or 2 markdups jobs and 20 sort jobs, or 60 sort jobs (the -j 40 option still caps how many jobs run at once).
Rules without mem_mb will not be counted towards memory usage, which is probably ok for rules that e.g. copy files and do not need much memory.
How much to assign to each rule is mostly a matter of estimation. The top and htop commands help in monitoring jobs and figuring out how much memory they need. More elaborate solutions could be devised, but I'm not sure it's worth it... If you use a job scheduler like slurm, the log files should give you the peak memory usage of each job, so you can use them for future guidance. Maybe others have better suggestions.
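The arithmetic behind the job counts in this answer can be checked with a short Python sketch. This only illustrates the budget calculation and is not Snakemake's actual scheduler; the helper names `max_jobs` and `fits` are hypothetical, and the numbers are the declared mem_mb values from the rules above.

```python
# Illustration only: budget arithmetic for declared mem_mb values.
# max_jobs and fits are hypothetical helpers, not part of Snakemake.

BUDGET_MB = 60000      # from --resources mem_mb=60000
MARKDUPS_MB = 20000    # declared by rule markdups
SORT_MB = 1000         # declared by rule sort


def max_jobs(budget_mb, mem_mb_per_job):
    """Largest number of identical jobs whose declared memory fits the budget."""
    return budget_mb // mem_mb_per_job


def fits(budget_mb, running):
    """Check whether a mix of jobs fits the budget.

    running maps a rule name to a (mem_mb per job, number of jobs) pair.
    """
    total = sum(mem_mb * count for mem_mb, count in running.values())
    return total <= budget_mb


print(max_jobs(BUDGET_MB, MARKDUPS_MB))  # 3
print(max_jobs(BUDGET_MB, SORT_MB))      # 60
print(fits(BUDGET_MB, {"markdups": (MARKDUPS_MB, 2), "sort": (SORT_MB, 20)}))  # True
```

Adding a third markdups job to the mixed case would bring the declared total to 80000 MB, so it would have to wait until enough memory is released.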

Getting Linux prof samples even if my program is in sleep state?

With a program without sleep function, perf collects callgraph samples well.
#include <stdbool.h>
#include <stdio.h>
int main(void)
{
    while (true)
    {
        printf(...);
    }
}
For example, more than 1,000 samples in a second.
I collected perf report with this:
sudo perf report -p <process_id> -g
However, when I do it with a program with sleep function, perf does not collect callgraph samples well: only a few samples in a second.
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>
int main(void)
{
    while (true)
    {
        sleep(1);
        printf(...);
    }
}
I want to collect the callgraph samples even if my program is in the sleep state (a.k.a. device time). On Windows with VSPerf, the callgraph for the sleep state is also collected well.
Collecting the callgraph for the sleep state is needed to find performance bottlenecks not only in CPU time but also in device time (e.g. accessing a database).
I guess there may be a perf option for collecting samples even if my program is in sleep state, because not only I but also many other programmers may want it.
How can I get the prof samples even if my program is in sleep state?
After posting this question, we found that perf -c 1 captures about 10 samples in a second. Without -c 1, perf captured 0.3 samples per second. 10 samples per second is much better for now, but it is still much less than 1000 samples per second.
Is there any better way?
CPU samples while your process is in the sleep state are mostly useless, but you could emulate this behavior by using an event that records the beginning and end of the sleep syscall (capturing the stacks), and then add the "sleep stacks" yourself in post-processing by duplicating the entry stack a number of times consistent with the duration of each sleep.
After all, the stack isn't going to change.
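The post-processing step described in this answer could look roughly like the Python sketch below. It assumes the sleep entry and exit events have already been paired into (stack, start, end) tuples. That input shape and the function names `synthesize_sleep_samples` and `to_folded` are hypothetical. Stacks are written in the semicolon-separated folded form that flame graph tools consume.

```python
from collections import Counter


def synthesize_sleep_samples(sleep_events, sample_hz):
    """Turn sleep intervals into synthetic sample counts.

    sleep_events: iterable of (stack, start_s, end_s), where stack is a
    root-first, semicolon-separated string captured at sleep entry.
    sample_hz: the rate at which a sampling profiler would have fired.
    """
    counts = Counter()
    for stack, start_s, end_s in sleep_events:
        duration_s = end_s - start_s
        # The stack does not change while asleep, so one stack stands in
        # for every sample that would have fallen inside the interval.
        counts[stack] += round(duration_s * sample_hz)
    return counts


def to_folded(counts):
    """Format counts as 'stack count' lines, sorted by stack."""
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


events = [
    ("main;sleep", 0.0, 1.0),
    ("main;query;read", 1.0, 1.25),
    ("main;sleep", 2.0, 3.5),
]
for line in to_folded(synthesize_sleep_samples(events, sample_hz=1000)):
    print(line)
```

The synthetic counts can then be merged with the on-CPU samples, so that time spent waiting shows up with a weight proportional to its duration.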
When you specify a profiling target, perf will only account for events that were generated by said target. Quite naturally, a sleep'ing target doesn't generate many performance events.
If you would like to see other processes (like a database?) in your callgraph reports, try system-wide sampling:
-a, --all-cpus
System-wide collection from all CPUs (default if no target is specified).
(from perf man page)
In addition, if you plan to spend a lot of time actually looking at the reports, there is a tool I cannot recommend enough: FlameGraphs. This visualization may save you a great deal of effort.

Is it possible to raise the sampling frequency of perf stat?

I am using perf for profiling, but the number of monitored PMU events is higher than the number of the hardware counters, so round-robin multiplexing strategy is triggered. However, some of my test cases may run less than a millisecond, which means that if the execution time is less than the multiplicative inverse of the default switch frequency (1000Hz), some events may not be profiled.
How can I raise the sampling frequency of perf stat, like perf record -F <frequency>, to make sure that every event is recorded, even if the measurement overhead slightly increases?
First off, remember that sampling is different than counting.
perf record samples the events that occur during profiling. This means that it does not record every event that happens (this can be tweaked, of course!). You can change the sampling rate to increase the number of samples collected. Typically, for every N events that occur (for some N > 0), perf record records only 1 of them.
perf stat will do a counting of all the events that occur. For each event that happens, perf stat will count it and will try not to miss any, unlike sampling. Of course, the number of events counted may not be accurate if there is multiplexing involved (i.e. when the number of events measured is greater than the number of available hardware counters). There is no concept of setting up frequencies in perf stat since all it does is a direct count of all the events that you intend to measure.
The Linux kernel source code confirms this: perf stat sets the sample period (the inverse of the sample frequency) to 0, so no sampling takes place.
Anyway, what you can do is a verbose reading of perf stat using perf stat -v to see and understand what is happening with all of the events that you are measuring.
To understand more about perf stat, you can also read this answer.
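To make the multiplexing caveat concrete, here is a minimal Python sketch of how a multiplexed count is usually extrapolated: the raw count is scaled by the ratio of the time the event was enabled to the time it was actually running on a hardware counter. The function names and the sample numbers are hypothetical.

```python
def scaled_count(raw_count, time_enabled_ns, time_running_ns):
    """Estimate the full count for an event that was multiplexed.

    Returns None when the event never got onto a hardware counter,
    because there is nothing to extrapolate from.
    """
    if time_running_ns == 0:
        return None
    return raw_count * time_enabled_ns / time_running_ns


def running_share(time_enabled_ns, time_running_ns):
    """Percentage of the enabled time during which the event was counted."""
    return 100.0 * time_running_ns / time_enabled_ns


# Hypothetical numbers: the event ran for a quarter of a 4 ms window.
print(scaled_count(1000, 4_000_000, 1_000_000))   # 4000.0
print(running_share(4_000_000, 1_000_000))        # 25.0
# A run shorter than one multiplexing interval may never schedule the event.
print(scaled_count(0, 500_000, 0))                # None
```

The last case is the one the question worries about: if the workload finishes before an event is ever scheduled, no scaling can recover a count, so measuring fewer events per run (or repeating the run with different event groups) is the safer option.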

perf report - what are the two percentage columns

Does anyone know where the perf report outputs are documented? Or in particular what the two percentage columns on the left are? I have found a number of examples showing a single percentage column but I get two. The commands I used are given below.
Many thanks!
perf record -g -a sleep 1
# Note: -a == all cpus -g enables backtrace recording for call graphs
perf report
+ 78.09% 0.00% node libc-2.19.so [.] __libc_start_main
+ 77.71% 0.00% node node [.] node::Start(int, char**)
...
Using -a does not make a lot of sense if you want to inspect the sleep command.
It is usually the percentage of time spent in the function. The first column includes all of the function's children (the calls made from the function); the second covers the function itself, without children.
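The difference between the two columns can be sketched in Python on a handful of made-up call stacks. This is only an illustration of the idea, not perf's implementation; the function name `overhead` and the sample data are hypothetical.

```python
from collections import Counter


def overhead(samples):
    """Compute (children %, self %) per function.

    samples: list of call stacks, each a root-first list of function
    names with the sampled (leaf) function last.
    """
    total = len(samples)
    self_hits = Counter(stack[-1] for stack in samples)
    children_hits = Counter()
    for stack in samples:
        # A function gets credit once per sample if it appears anywhere
        # on the stack, which covers itself and everything it called.
        for func in set(stack):
            children_hits[func] += 1
    return {
        func: (100.0 * children_hits[func] / total,
               100.0 * self_hits[func] / total)
        for func in children_hits
    }


samples = [
    ["__libc_start_main", "node::Start", "compile"],
    ["__libc_start_main", "node::Start", "run"],
    ["__libc_start_main", "node::Start", "run"],
    ["idle"],
]
for func, (children, self_pct) in sorted(overhead(samples).items()):
    print(f"{children:6.2f}% {self_pct:6.2f}%  {func}")
```

In this toy data, __libc_start_main is on the stack in three of four samples but is never the sampled function itself, which is the same pattern as the 78.09% / 0.00% line in the question.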

Get `perf` to provide alphabetized list of functions

Despite providing perf report the --sort symbol option, it's always using the overhead as the primary sort key.
I'd like to get perf to provide me an alphabetized list of functions, with their overhead/measured ticks as the second column. Is this impossible without perf script?
I don't think it's possible (to be confirmed by perf experts) with perf directly. Nevertheless you can easily redirect perf report output to a file
perf report --stdio > out
Then remove the heading lines preceding the data and use sort:
sort -b -k 3 out
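The same post-processing could also be sketched in Python, which avoids depending on a fixed column position. The sketch assumes that each data line of the saved report starts with a percentage and contains a bracketed marker such as [.] or [k] right before the symbol name. The exact layout varies with perf version and options, so the regular expression is an assumption to adapt, and the function name `alphabetize` is hypothetical.

```python
import re

# Assumed line shape: "<pct>%  ...  [.] symbol" (or [k] for kernel symbols).
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%.*?\[[.\w]\]\s+(.+?)\s*$")


def alphabetize(report_text):
    """Return (symbol, overhead) pairs sorted by symbol name."""
    rows = []
    for line in report_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip the heading lines
        match = LINE_RE.match(line)
        if match:
            # The first percentage on the line is taken as the overhead.
            rows.append((match.group(2), float(match.group(1))))
    return sorted(rows)


example = """\
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ......
    60.00%  app      app                [.] zeta
    30.00%  app      libc.so            [.] alpha
    10.00%  app      [kernel.kallsyms]  [k] mid
"""
for symbol, pct in alphabetize(example):
    print(f"{symbol:20s} {pct:6.2f}%")
```

To use it on real data, read the file produced by perf report --stdio and pass its contents to alphabetize.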
