What does .text.unlikely mean in ELF object files? - symbols

In my objdump -t output, I see the following two lines:
00000000000004d2 l F .text.unlikely 00000000000000ec function-signature-goes-here [clone .cold.427]
and
00000000000018e0 g F .text 0000000000000690 function-signature-goes-here
I know l means local and g means global. I also know that .text is a section, or a type of section, in an object file, containing compiled program instructions. But what is .text.unlikely? Assuming it's a different section (or type-of-section) from .text - what's the difference?

In my GCC v5.4.0 manpage, I found the following switch:
-freorder-functions
which says:
Reorder functions in the object file in order to improve code
locality. This is implemented by using special subsections
".text.hot" for most frequently executed functions and
".text.unlikely" for unlikely executed functions. Reordering is done
by the linker so object file format must support named sections and
linker must place them in a reasonable way.
Also profile feedback must be available to make this option effective.
See -fprofile-arcs for details.
Enabled at levels -O2, -O3, -Os.
Looks like the compiler was run with optimization flags or that switch for this binary, and functions are organized in subsections to optimize spatial locality.
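To see which functions ended up in which section, the symbol-table lines can be grouped by their section column. The following is a minimal Python sketch, not something from the GCC or binutils documentation. It assumes the whitespace-separated layout of the two objdump -t lines quoted above, and that section names start with "." (or "*", as in *UND*). The helper names parse_symbol_line and group_by_section are hypothetical.

```python
from collections import defaultdict


def parse_symbol_line(line):
    """Split one `objdump -t` symbol line into its fields.

    Assumed layout: address, one or more flag characters, section,
    size, symbol name (the name may contain spaces).
    Returns None for lines that do not fit that layout.
    """
    tokens = line.split()
    if len(tokens) < 4:
        return None
    # The section is taken to be the first token after the address that
    # starts with "." or "*"; everything before it is flag characters.
    for i in range(1, len(tokens) - 1):
        if tokens[i].startswith((".", "*")):
            break
    else:
        return None
    try:
        address = int(tokens[0], 16)
        size = int(tokens[i + 1], 16)
    except ValueError:
        return None
    return {
        "address": address,
        "flags": "".join(tokens[1:i]),
        "section": tokens[i],
        "size": size,
        "name": " ".join(tokens[i + 2:]),
    }


def group_by_section(lines):
    """Map each section name to the list of symbol names placed in it."""
    sections = defaultdict(list)
    for line in lines:
        symbol = parse_symbol_line(line)
        if symbol is not None:
            sections[symbol["section"]].append(symbol["name"])
    return dict(sections)


if __name__ == "__main__":
    sample = [
        "00000000000004d2 l F .text.unlikely 00000000000000ec "
        "function-signature-goes-here [clone .cold.427]",
        "00000000000018e0 g F .text 0000000000000690 "
        "function-signature-goes-here",
    ]
    for section, names in group_by_section(sample).items():
        print(section, names)
```

Feeding it the full objdump -t output (for example via a pipe) would list the symbols the compiler moved into .text.unlikely next to those left in .text.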


Do we have an easy way to see elaborated core terms in agda?

I'm studying how dependent pattern matching works in Agda.
It would be really helpful if I could see the elaborated core terms (https://github.com/agda/agda/blob/master/src/full/Agda/Syntax/Internal.hs#L202) for arbitrary source code in an .agda file.
However, the Agda CLI does not seem to offer any option for this. Is there one?
There are three options you could try depending on how much detail you want, though none of them is perfect:
If all you want is to see what implicit arguments Agda has inserted, you can enable the flags --show-implicit and --show-irrelevant, create a new hole with the term you want to inspect by adding _ = {! yourTerm !} at the bottom of the file, reload the file with C-c C-l, and then press C-u C-c C-m with the cursor inside the hole. [Writing this out made me realize there ought to be a simpler way to do this.]
If you want to inspect and possibly manipulate the full AST of an Agda term, you can do so using the reflection API (https://agda.readthedocs.io/en/v2.6.2.1/language/reflection.html). In particular, you can get the reflected syntax of an arbitrary Agda term by using the quoteTerm primitive.
Finally, if you need more information you can look in the source code of Agda itself and enable the debug flags for printing the information you want. Note that there is no guarantee that this debug information will be useful or even readable, as it is intended for use by the developers. With that being said, you could for example print the case tree generated from a definition by pattern matching by adding {-# OPTIONS -v tc.cc:12 #-} at the top of your file. In Emacs, this debug information will end up in a separate buffer titled *Agda debug* (which you'll have to open manually after loading the .agda file).

What can I get from the Lua bytecode retrieved via string.dump()?

I’m using Lua with LuaJIT 2.0.4 and I’m wondering: is it possible to restore parts of the original code from dumps of Lua functions?
function a(l)
  if l > 3 then
    print(l*l)
  end
end
local b = string.dump(a)
In this example, I call string.dump on the function 'a', which leads me to the following questions:
1. Is it possible to write this dump into a .txt file?
2. Is it possible to get the original names of functions, variables, and upvalues?
3. Is it possible to get strings, numbers, tables?
4. Is it possible to restore it to the full code, and if not, is it possible to get a disassembled listing?
"Yes" to all questions with a couple of caveats. For (1), make sure that "b" is used as part of the "mode" parameter in io.open on Windows, as the output of string.dump will have some binary content. For (2), it's only true when string.dump is used without the strip option, which was added in LuaJIT:
string.dump(f [,strip])
An extra argument has been added to string.dump(). If set to true,
'stripped' bytecode without debug information is generated. This
speeds up later bytecode loading and reduces memory usage.
For (4), I found this document to be very useful: http://files.catwell.info/misc/mirror/lua-5.2-bytecode-vm-dirk-laurie/lua52vm.html (it's for Lua 5.2, but most of the content applies to LuaJIT as well); it also includes a section on the difference between full and stripped bytecode that may answer some of your questions.
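Whether a given dump still carries debug information can be checked from its header before attempting any disassembly. The sketch below is in Python rather than Lua and rests on an assumption about the LuaJIT dump layout as defined in LuaJIT's lj_bcdump.h: the bytes ESC 'L' 'J', then a version byte, then a flags field whose bit 0x02 marks stripped bytecode. Verify this against the LuaJIT version you actually use. The function name describe_luajit_dump is hypothetical.

```python
def describe_luajit_dump(data):
    """Inspect the header of a LuaJIT bytecode dump.

    Assumed header layout (from LuaJIT's lj_bcdump.h, please verify):
      bytes 0..2  ESC 'L' 'J'
      byte  3     dump format version
      byte  4     flags (ULEB128; the flags assumed here fit in one byte)
    Returns None if the data does not start with the expected signature.
    """
    if len(data) < 5 or data[:3] != b"\x1bLJ":
        return None
    flags = data[4]
    return {
        "version": data[3],
        "big_endian": bool(flags & 0x01),  # assumed BCDUMP_F_BE
        "stripped": bool(flags & 0x02),    # assumed BCDUMP_F_STRIP
        "uses_ffi": bool(flags & 0x04),    # assumed BCDUMP_F_FFI
    }


if __name__ == "__main__":
    import sys

    # Read the dump in binary mode ("rb"), for the same reason the dump
    # has to be written in binary mode on the Lua side.
    with open(sys.argv[1], "rb") as handle:
        print(describe_luajit_dump(handle.read()))
```

If "stripped" comes back true, the names of locals and upvalues asked about in (2) are not in the file at all.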

GCC ppc64 aligned functions

I'm using GCC to build a powerpc64 executable, but sometimes I get unexpected filler bytes between functions: Screenshot
PowerPC instructions are still 4 bytes each. I tried some GCC options (-fno-align-functions), but the compiler still emits fill bytes between functions.
I want each function to start directly after the end of the previous one, without any zero-filled padding (in the case of the screenshot, the function should start at 0x124).
Thanks.
The PPC64 ABI specifies a traceback table appended to functions. The zeroes may be due to the traceback table and not related to alignment. Try using the -mtraceback=no command line option.
In addition to the traceback table issue noted in the previous answer, functions are normally aligned on a 16-byte boundary. This is important for various reasons, including so the compiler can align hot loops on a 16-byte boundary for improved icache performance. Assembly code from GCC will have a directive like:
.p2align 4,,15
before each function definition to enforce this. So even without the traceback table your function will not start at address 0x124 without more effort.
This behavior can be overridden using -fno-align-functions, or using optimization level -Os (optimize for size). I've tried both methods, and they both remove the .p2align directive. Using -fno-align-functions is preferable unless you really want smaller and potentially slower code.
(If you are compiling with -O0 or -O1, you won't see the directive either, but we do not recommend compiling at such low optimization levels for either size or speed.)
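The effect of the alignment directive on the address from the question can be worked out by hand. A small Python sketch of that arithmetic, assuming the first argument of .p2align is the power of two to align to (so 4 means a 16-byte boundary) and ignoring any traceback table; the helper names align_up and padding_bytes are made up for this illustration:

```python
def align_up(address, power):
    """Round address up to the next multiple of 2**power."""
    alignment = 1 << power
    return (address + alignment - 1) & ~(alignment - 1)


def padding_bytes(address, power):
    """Number of fill bytes needed to reach the aligned address."""
    return align_up(address, power) - address


if __name__ == "__main__":
    end_of_previous = 0x124  # address taken from the question
    print(hex(align_up(end_of_previous, 4)))   # 0x130
    print(padding_bytes(end_of_previous, 4))   # 12
```

With 16-byte alignment, a function that would otherwise start at 0x124 is pushed to 0x130, leaving 12 fill bytes, which is three 4-byte instruction slots.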

Reading COBOL code with .NET to generate a call graph

I am working on a project to automatically generate a class diagram from COBOL code. I am developing it as a .NET console application. I need help tracking down the name of the procedure in which a perform statement is used, as in the example below.
Z-POST-COPYRIGHT.
    move 0 to RETURN-CODE
    perform Z-WRITE-FILE
How do I track the procedure name 'Z-POST-COPYRIGHT' from which the procedure 'Z-WRITE-FILE' is called? The only idea I could think of in terms of COBOL is to use indentation, as the procedure names are always indented. Ideally, the database should record the procedure name that follows the word 'perform' and the procedure under which it is called (in this case Z-POST-COPYRIGHT).
I assume you want to do this "on your own" without external tools (a faster approach can be found at the end).
You first have to "know" your source:
which compiler was it compiled with (get a manual for this compiler)
which options were used
Then you have to preparse the source:
include copybooks (doing the given REPLACING rules if any)
if the source is in fixed-form reference format: concatenate contents of last line and current line if you find a - in column 7
check for REPLACE and change the result accordingly
remove all comments (possibly only lines with * or / in column 7 in fixed-form reference format or similar; extensions like "variable" format, "terminal" format, ... exist; possibly only inline comments when in free-form reference format, otherwise possibly inline comments *> or compiler-specific extensions like |). Depending on the further re-engineering you want to do, it could be a good idea to extract the comments and store them, at least with a line number reference
Then you can finally track the procedure name with the following rule:
go backwards to the last separator period (there are more rules, but the rule "at least one line break, another period, a space, a comma or a semicolon" [I've never seen the last two in real code, but it is possible] should be enough)
check if there is only one word between this separator period and the next
if this word is no reserved COBOL word (this depends on your compiler) it is very likely a procedure name
Start from here and check the output, then refine the rule based on actual false positives or missing entries.
If you want to do more than only extract the procedure names for PERFORM and GO TO (you should at least check the sources for PERFORM ... THRU), then this can turn into a lot of work...
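The rule above can be sketched roughly as follows. This is written in Python rather than .NET to keep it short, and it is only a starting point under strong assumptions: the source has already been preparsed (copybooks included, comments removed), no string literal contains a period followed by whitespace, and the tiny RESERVED set stands in for your compiler's real reserved-word list. The names find_perform_calls and RESERVED are hypothetical.

```python
import re

# Placeholder subset only; the real list depends on your compiler.
RESERVED = {
    "EXIT", "STOP", "GOBACK", "CONTINUE", "ELSE",
    "END-IF", "END-PERFORM", "END-EVALUATE",
    "UNTIL", "VARYING", "WITH", "TEST",
}

# A separator period: a period followed by whitespace or end of text.
SENTENCE_END = re.compile(r"\.(?:\s+|$)")

# PERFORM followed by one word; the lookbehind avoids matching END-PERFORM.
PERFORM = re.compile(
    r"(?<![A-Z0-9-])PERFORM\s+([A-Z0-9][A-Z0-9-]*)", re.IGNORECASE
)


def find_perform_calls(source, reserved=RESERVED):
    """Return (calling procedure, performed procedure) pairs.

    A sentence made of a single non-reserved word is treated as a
    procedure name; every later PERFORM is attributed to it.
    """
    calls = []
    current = None
    for sentence in SENTENCE_END.split(source):
        words = sentence.split()
        if not words:
            continue
        if len(words) == 1 and words[0].upper() not in reserved:
            current = words[0].upper()
            continue
        for match in PERFORM.finditer(sentence):
            target = match.group(1).upper()
            # Skip inline PERFORM (UNTIL, VARYING, ...) and "PERFORM 5 TIMES".
            if target in reserved or target.isdigit():
                continue
            calls.append((current, target))
    return calls


if __name__ == "__main__":
    sample = """
    Z-POST-COPYRIGHT.
        move 0 to RETURN-CODE
        perform Z-WRITE-FILE.
    Z-WRITE-FILE.
        continue.
    """
    print(find_perform_calls(sample))
```

Known gaps in this sketch include section headers (NAME SECTION.), PERFORM ... THRU ranges, GO TO, and PERFORM identifier TIMES, which it would report as a call; those are exactly the cases to refine against real output.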
Faster approach with external tools:
run a COBOL compiler on the complete sources and tell it to do the preparsing only - this way you have the big second point solved already
if you have the option: tell the compiler or an external tool to create a symbol table / cross reference - this will tell you in which line a procedure is and its name (you can simply find the correct procedure by comparing the line)
Just a note: You may want to check GnuCOBOL (formerly OpenCOBOL) for the preparsing and/or generation of symbol tables/cross-reference and/or printcbl for a completely external tool doing preparsing and/or cobxref for a complete cross reference generation.

meaning of llvm[n] when compiling llvm, where n is an integer

I'm compiling LLVM as well as clang. I noticed that the output of compilation has llvm[1]: or llvm[2]: or llvm[3]: prefixed to each line. What do those integers in brackets mean?
Apparently, it's not connected to the number of the compilation job (this can be easily checked via make -j 1). The autoconf-based build system indicates the "level" of the makefile inside the source tree. To be precise, it's the value of make's MAKELEVEL variable.
The currently accepted answer is not correct. Furthermore, this is really a GNU Make question, not a LLVM question.
What you're seeing is the current value of the MAKELEVEL variable echoed by make to the command line. This value is set as a result of recursive execution. From the GNU make manual:
As a special feature, the variable MAKELEVEL is changed when it is passed down from level to level. This variable’s value is a string which is the depth of the level as a decimal number. The value is ‘0’ for the top-level make; ‘1’ for a sub-make, ‘2’ for a sub-sub-make, and so on. The incrementation happens when make sets up the environment for a recipe.
If you have a copy of the GNU Make source code on hand, you can see your output message being generated in output.c with the void message(...) function. In GNU make v4.2, this happens on line 626. Specifically the program argument is set to the string "llvm" and the makelevel argument is set as noted above.
Since it was erroneously brought up, it is not the number of the compilation job. The -j [jobs] or --jobs[=jobs] options enable parallel execution of up to jobs number of recipes simultaneously. If -j or --jobs is selected, but jobs is not set, GNU Make attempts to execute as many recipes simultaneously as possible. See this section of the GNU Make manual for more information.
It is possible to have recursive execution without parallel execution, and parallel execution without recursive execution. This is the main reason that the currently accepted answer is not correct.
It's the number of the compilation job (make -j). Helpful to trace compilation errors.
