Get `perf` to provide an alphabetized list of functions

Despite passing perf report the --sort symbol option, it always uses the overhead as the primary sort key.
I'd like to get perf to provide me an alphabetized list of functions, with their overhead/measured ticks as the second column. Is this impossible without perf script?

I don't think it's possible with perf directly (to be confirmed by perf experts). Nevertheless, you can easily redirect the perf report output to a file:
perf report --stdio > out
Then remove the first few "heading lines" preceding the data and use sort:
sort -b -k 3 out
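Putting it together, a hedged one-liner that filters the heading lines (they start with '#') instead of deleting them by hand; with --sort symbol the function name should be the third whitespace-separated field, but adjust the -k index if your perf version prints extra columns:
grep -vE '^#|^$' out | sort -b -k 3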

Wrong timestamps in NetFlow data generated by ESXi

I have a problem with the "Date first seen" column in the output generated by nfdump. I have enabled NetFlow on an ESXi 5.5 host to send NetFlow data to a NetFlow collector. So far everything is OK and I can capture NetFlow data with nfcapd using the following command:
nfcapd -D -z -u netflow -p 9996 -n Esxi,192.168.20.54,/data/nfdump -S2 -e
But the problem is that when I filter the traffic with nfdump (e.g. with nfdump -R nfdump5/2016/ -c 10), I see "1970-01-01 03:30:00.000" in the "Date first seen" column of every entry. What should I do to get the right timestamps?
Any help is appreciated.
The NetFlow header has a timestamp for the whole datagram; most likely, their export is using the "first seen" field as an offset from that. It's possible nfdump isn't correctly interpreting that field; I'd recommend having a look at the capture in Wireshark, which I've found to be pretty reliable in decoding NetFlow. That will also let you examine the flow records directly to see if the timestamps are really coming across that small, or are just being misinterpreted. Just remember that if you're capturing NetFlow v9 or IPFIX, you'll need to make sure that your capture includes a template datagram.
If the ESXi's NetFlow isn't exporting timestamps correctly, you can also look into monitoring using a small virtual machine running a software flow exporter (there are a number of free ones available - just Google "free flow exporter") with an interface in promiscuous mode.
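To see what the ESXi is actually exporting, a rough sketch (the interface and file names here are just placeholders; the port matches the nfcapd command above) is to capture the export traffic on the collector and decode it with tshark, Wireshark's command-line counterpart, whose NetFlow/IPFIX dissector is called cflow:
tcpdump -i eth0 -w netflow.pcap udp port 9996
tshark -r netflow.pcap -Y cflow -V | less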

Does awk store an output file in RAM?

I was doing some simple parsing like awk '{print $3 > "file1.txt"}'
What I noticed was that awk was taking up a lot of RAM (the files were huge). Does streaming awk output to a file consume memory? Does it work like a stream write, or does the file stay open until the program terminates?
The exact command that I gave was:
for i in ../../*.txt; do j=${i#*/}; mawk -v f=${j%.txt} '{if(NR%8<=4 && NR%8!=0){print >f"_1.txt" } else{print >f"_2.txt"}}' $i & done
As is evident, I used mawk. The five input files were around 6 GB each, and when I ran top I saw about 22% of memory (~5 GB) being taken up by each mawk process at its peak. I noticed it because my system was hanging due to low memory.
I am fairly sure that redirection outside awk consumes negligible memory. I have done it several times with files much larger than this and with more complex operations, and I never faced this problem. Since I had to copy different sections of the input files to different output files, I used redirection inside awk.
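For clarity, this is the distinction I mean (the input file name is just an example):
# redirection outside awk: the shell opens the file; awk just writes to stdout
awk '{print $3}' input.txt > file1.txt
# redirection inside awk: awk manages the output file itself
awk '{print $3 > "file1.txt"}' input.txt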
I know there are other ways to implement this task, and in any case my job got done without much trouble. All I am interested in is how awk works when writing to a file.
I am not sure if this question is better suited for Superuser.

How to split/filter events from perf.data?

Question: Is there a way to extract the samples for a specific event from a perf.data file that contains multiple events?
Context
I have recorded samples for two events, which I took by running something like:
perf record -e cycles,instructions sleep 30
As far as I can see, other perf commands such as perf diff and perf script don't have an events filter flag. So it would be useful to split my perf.data into cycles.perf.data and instructions.perf.data.
Thanks!

Completely restore a binary from memory?

I want to know if it's possible to completely restore the binary running in memory.
This is what I've tried:
First read /proc/PID/maps, then dump all the relevant sections with gdb (ignoring all libraries):
grep sleep /proc/1524/maps | awk -F '[- ]' \
'{print "dump memory sleep." $1 " 0x" $1 " 0x" $2 }' \
| gdb -p 1524
Then I concatenate all dumps in order:
cat sleep.* > sleep-bin
But the file is very different from /bin/sleep.
It seems to differ in the relocation table and other uninitialized data, so is it impossible to fix a memory dump (i.e. make it runnable)?
Disclaimer: I'm a Windows guy and don't know much about Linux process internals or the ELF format, but I hope I can help!
I would say it's definitely possible, but not for ALL programs. The OS loader loads into memory only those parts of the executable that sit at well-defined places in the file. For example, some uninstallers store data appended to the executable file; that data is never loaded into memory, so it is information you cannot restore just by dumping memory.
Another problem is that the information written by the OS can be modified by anything on the system that has the rights to do so, although no normal program would do something like that.
The starting point would be to find the ELF headers of your executable module in memory and dump that. It will contain pretty much all the data you need for your task. For example:
the number of sections and where they are in memory and in the file
how sections in the file are mapped to sections in virtual memory (they usually have different base addresses and sizes!)
where the relocation data is
For the relocs you would have to read up on how relocation data is stored and processed in the ELF format. Once you know that, it should be pretty easy to undo the changes in your dump.
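As a rough sketch of that first step, and using the file names from the question, you can compare what the on-disk ELF says should be loaded against what you dumped:
# PT_LOAD entries show which file offsets are mapped to which virtual addresses and sizes
readelf -l /bin/sleep
# the concatenated dump should normally begin with the ELF magic bytes (7f 45 4c 46)
head -c 16 sleep-bin | xxd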

Redis Capped Sorted Set, List, or Queue?

Has anyone implemented a capped data-structure of any kind in Redis? I'm working on building something like a news feed. The feed will wind up being manipulated and read from very frequently, and holding it in a sorted set in Redis would be cheap and perfect for my use case. The only issue is I only ever need n items per feed, and I'm worried about memory overflow, so I'd like to ensure each feed never gets above n items. It seems pretty trivial to make a capped sorted collection in Redis with Lua:
redis-cli EVAL "$(cat update_feed.lua)" 1 feeds:some_feed "thing_to_add" n
Where update_feed.lua looks something like this (untested):
-- TIME replaces os.time(), which isn't in Redis' Lua sandbox (older Redis may need redis.replicate_commands())
redis.call('ZADD', KEYS[1], redis.call('TIME')[1], ARGV[1])
local cap = tonumber(ARGV[2])
local num = redis.call('ZCARD', KEYS[1])
if num > cap then
    -- drop the oldest (lowest-ranked) entries so only `cap` remain
    redis.call('ZREMRANGEBYRANK', KEYS[1], 0, num - cap - 1)
end
That's not bad at all, and pretty cheap, but it seems like such a basic thing that it could be done much more cheaply by instantiating the sorted set with only n buckets to begin with. I can't find a way to do that in Redis, so I guess my question is: did I miss something, and if I didn't, why is there no structure for this in Redis? Even if it just ran the basic Lua script I described, it seems like a typical enough use case that it ought to be offered as an option for Redis data structures.
You can use LTRIM if it is a list.
Excerpt from the documentation.
LPUSH mylist someelement
LTRIM mylist 0 99
This pair of commands will push a new element on the list, while making sure that the list will not grow larger than 100 elements. This is very useful when using Redis to store logs for example. It is important to note that when used in this way LTRIM is an O(1) operation because in the average case just one element is removed from the tail of the list.
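Applied to the feed from the question, that would look something like this (the key name is the one from the question; the cap of 100 is just illustrative):
redis-cli LPUSH feeds:some_feed "thing_to_add"
redis-cli LTRIM feeds:some_feed 0 99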
I use sorted sets myself for this. I too thought about using lists, but then I found that manipulating the INSIDE of a list is fairly expensive, O(n), while manipulating the inside of a sorted set is O(log n).
That's what sealed the deal for me: will you ever be manipulating the inside of the set? If so, stick with sorted sets and just flush the oldest elements whenever you have to, just as you were planning.
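For the sorted-set route, a minimal sketch of that "flush the oldest" step, again with the key from the question (the score value and the cap of 100 are just illustrative):
redis-cli ZADD feeds:some_feed 1700000000 "thing_to_add"
# keep only the newest 100 members by removing everything below the top 100 ranks
redis-cli ZREMRANGEBYRANK feeds:some_feed 0 -101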
