Short question: how to compile with clang with no code optimization? -O0 is not working.
Long question:
I'm learning code optimization and LLVM in particular. I'm writing small examples, compiling them and then running just one optimization at a time, to analyze what it changes. For example, to test Dead Code Elimination, I tried this:
int main() {
int a = 20 + 30;
int b = 25; /* Assignment to dead variable */
int c;
c = a << 2;
return c;
b = 24; /* Unreachable code */
return 0;
}
However, when I compile it with
clang -S -O0 -emit-llvm foo.c
The last two lines of my C code do not show up in the IR code (below). Also, the 20 + 30 is already being calculated to 50. So there's some optimization going on here, even though I'm using -O0.
; ModuleID = 'hello.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%a = alloca i32, align 4
%b = alloca i32, align 4
%c = alloca i32, align 4
store i32 0, i32* %retval
store i32 50, i32* %a, align 4
store i32 25, i32* %b, align 4
%0 = load i32* %a, align 4
%shl = shl i32 %0, 2
store i32 %shl, i32* %c, align 4
%1 = load i32* %c, align 4
ret i32 %1
}
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"clang version 3.4 (trunk 192936)"}
Here's a simple C file with an enum definition and a main function:
enum days {MON, TUE, WED, THU};
int main() {
enum days d;
d = WED;
return 0;
}
It compiles to the following LLVM IR:
define dso_local i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
store i32 0, i32* %1, align 4
store i32 2, i32* %2, align 4
ret i32 0
}
%2 is evidently the d variable, which gets 2 assigned to it. What does %1 correspond to if zero is returned directly?
This %1 register was generated by clang to handle multiple return statements in a function. Imagine you were writing a function to compute an integer's factorial. Instead of this
int factorial(int n){
int result;
if(n < 2)
result = 1;
else{
result = n * factorial(n-1);
}
return result;
}
You'd probably do this
int factorial(int n){
if(n < 2)
return 1;
return n * factorial(n-1);
}
Why? Because Clang will insert that result variable that holds the return value for you. Yay. That's the reason for that %1 variable. Look at the IR for a slightly modified version of your code.
Modified code,
enum days {MON, TUE, WED, THU};
int main() {
enum days d;
d = WED;
if(d) return 1;
return 0;
}
IR,
define dso_local i32 @main() #0 !dbg !15 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
store i32 0, i32* %1, align 4
store i32 2, i32* %2, align 4, !dbg !22
%3 = load i32, i32* %2, align 4, !dbg !23
%4 = icmp ne i32 %3, 0, !dbg !23
br i1 %4, label %5, label %6, !dbg !25
5: ; preds = %0
store i32 1, i32* %1, align 4, !dbg !26
br label %7, !dbg !26
6: ; preds = %0
store i32 0, i32* %1, align 4, !dbg !27
br label %7, !dbg !27
7: ; preds = %6, %5
%8 = load i32, i32* %1, align 4, !dbg !28
ret i32 %8, !dbg !28
}
Now you see %1 making itself useful huh? Most functions with a single return statement will have this variable stripped by one of llvm's passes.
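To make the role of %1 concrete, here is a rough hand-written sketch of the same idea at the source level. The function names pick and pick_lowered are hypothetical, and the second function only mimics the shape of the unoptimized IR above (one return slot, one exit block); it is not a claim about how Clang is implemented internally.

```cpp
#include <cassert>

enum days { MON, TUE, WED, THU };

// Source-level shape: two separate return statements.
int pick(days d) {
    assert(d >= MON && d <= THU);
    if (d) return 1;
    return 0;
}

// Hand-lowered shape: every return stores into one slot and jumps to a
// single exit, which loads the slot and returns it.
int pick_lowered(days d) {
    int retval;          // plays the role of %1 in the IR above
    if (d) {
        retval = 1;      // corresponds to "store i32 1, i32* %1"
        goto done;
    }
    retval = 0;          // corresponds to "store i32 0, i32* %1"
done:
    return retval;       // corresponds to the load of %1 followed by ret
}
```

Both functions return the same value for every input; the second just spells out the slot that the frontend would otherwise create for you.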
Why does this matter — what's the actual problem?
I think the deeper answer you're looking for might be: LLVM's architecture is based around fairly simple frontends and many passes. The frontends have to generate correct code, but it doesn't have to be good code. They can do the simplest thing that works.
In this case, Clang generates a couple of instructions that turn out not to be used for anything. That's generally not a problem, because some part of LLVM will get rid of superfluous instructions. Clang trusts that to happen. Clang doesn't need to avoid emitting dead code; its implementation may focus on correctness, simplicity, testability, etc.
Because Clang is done with syntax analysis but LLVM hasn't even started with optimization.
The Clang front end has generated IR (Intermediate Representation), not machine code. Those variables are SSA (Static Single Assignment) values; they haven't been bound to registers yet and, after optimization, never will be, because they are redundant.
That code is a somewhat literal representation of the source. It is what clang hands to LLVM for optimization. Basically, LLVM starts with that and optimizes from there. Indeed, for version 10 and x86_64, llc -O2 will eventually generate:
main: # @main
xor eax, eax
ret
When compiling this C++ file, which uses the list library, with the command clang++ list-simple-test.cpp -c -emit-llvm:
// list1.cpp
#include <list>
using namespace std;
int main(int argc, char **argv)
{
int x = 1;
list<int*> alist;
alist.push_back(&x);
return x;
}
I notice that some functions, like _ZNSt8__detail15_List_node_base7_M_hookEPS0_, are generated without a function body:
; Function Attrs: nounwind
declare dso_local void @_ZNSt8__detail15_List_node_base7_M_hookEPS0_(%"struct.std::__detail::_List_node_base"*, %"struct.std::__detail::_List_node_base"*) #5
While most of the other functions are generated with a complete body, for example, like the function _ZNSt7__cxx1110_List_baseIPiSaIS1_EE11_M_inc_sizeEm below:
; Function Attrs: noinline nounwind optnone uwtable
define linkonce_odr dso_local void @_ZNSt7__cxx1110_List_baseIPiSaIS1_EE11_M_inc_sizeEm(%"class.std::__cxx11::_List_base"*, i64) #1 comdat align 2 {
%3 = alloca %"class.std::__cxx11::_List_base"*, align 8
%4 = alloca i64, align 8
store %"class.std::__cxx11::_List_base"* %0, %"class.std::__cxx11::_List_base"** %3, align 8
store i64 %1, i64* %4, align 8
%5 = load %"class.std::__cxx11::_List_base"*, %"class.std::__cxx11::_List_base"** %3, align 8
%6 = load i64, i64* %4, align 8
%7 = getelementptr inbounds %"class.std::__cxx11::_List_base", %"class.std::__cxx11::_List_base"* %5, i32 0, i32 0
%8 = getelementptr inbounds %"struct.std::__cxx11::_List_base<int *, std::allocator<int *> >::_List_impl", %"struct.std::__cxx11::_List_base<int *, std::allocator<int *> >::_List_impl"* %7, i32 0, i32 0
%9 = getelementptr inbounds %"struct.std::__detail::_List_node_header", %"struct.std::__detail::_List_node_header"* %8, i32 0, i32 1
%10 = load i64, i64* %9, align 8
%11 = add i64 %10, %6
store i64 %11, i64* %9, align 8
ret void
}
I understand that those functions are from libstdc++.so. But why does Clang generate the body for some functions but not for others?
Does anybody know how to make Clang generate the body of _ZNSt8__detail15_List_node_base7_M_hookEPS0_ as well?
Thank you very much for reading my question! I'm writing a static analysis tool, which needs to analyze the body of _ZNSt8__detail15_List_node_base7_M_hookEPS0_ to obtain more precise result.
Most probably those other functions are coming from C++ templates.
When you declare a templated function, you have to provide its implementation in the header file in most cases. This way their code ends up in your own translation unit, and you see this code in your IR.
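As a small illustration of that difference, here is a sketch with one function template defined in the translation unit and one library function that is only declared by its header. The names inc_size and total are made up for this example. Compiled with -emit-llvm at -O0, I would expect the instantiation of inc_size to appear with a body and strlen to appear only as a declare line, though the exact output depends on the Clang version and standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Template: the definition is visible here, so each instantiation that is
// used gets emitted into this translation unit with a full body.
template <typename T>
T inc_size(T size, T n) {
    return size + n;
}

// std::strlen is only declared by <cstring>; its definition lives in the
// C library, so this translation unit has no body for it.
std::size_t total(const char *s) {
    assert(s != nullptr);
    return inc_size<std::size_t>(std::strlen(s), 1);
}
```

This mirrors the situation in the question: _M_inc_size is a member of a class template and is defined in the header, while _M_hook is a non-template function whose definition is compiled into the library.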
I temporarily found a workaround using the suggestion from @arrowd.
# generate list-simple-test.bc
clang++ list-simple-test.cpp -c -emit-llvm
# generate list.bc (list.cc is from the source code of libstdc++)
clang++ -c -emit-llvm list.cc
# combine list-simple-test.bc and list.bc
llvm-link list.bc list-simple-test.bc -o list-simple-final.bc
In the code above, list.cc can be downloaded from the gcc project
The final bitcode file list-simple-final.bc will contain the definition of _ZNSt8__detail15_List_node_base7_M_hookEPS0_, which is provided by list.cc
I am trying to generate LLVM bitcode and disassembled (.ll) code from C source code. I want the instructions in the bitcode to have variable names similar to those in the source code.
Suppose I have a source code (sample.c):
int test(int a){
return a++;
}
The sample.ll contains :
; Function Attrs: noinline nounwind uwtable
define i32 @test(i32) #0 {
%2 = alloca i32, align 4
store i32 %0, i32* %2, align 4
%3 = load i32, i32* %2, align 4
%4 = add nsw i32 %3, 1
store i32 %4, i32* %2, align 4
ret i32 %3
}
Here, %0 resembles variable a in the source code.
How can I generate a sample.ll like this?
; Function Attrs: noinline nounwind
define i32 @test(i32 %a) #0 {
entry:
%a.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
%0 = load i32, i32* %a.addr, align 4
%inc = add nsw i32 %0, 1
store i32 %inc, i32* %a.addr, align 4
ret i32 %0
}
Where %a resembles variable a in the source code.
NB: The clang version I am using is 6.0.0-1ubuntu2~16.04.1
I am using the command : clang -Xclang -disable-O0-optnone -O0 -emit-llvm -c sample.c -o sample.bc and then llvm-dis sample.bc
The thing you want to name isn't an Instruction, it's an Argument. The Argument constructor takes a Name argument, which is probably the intended way to set that. I've no idea why clang doesn't do that in your case. You can also call setName() later.
Making instructions have names follows the same pattern, provided that they don't have a void type. In your example, alloca and inc both have names. Making the load have a name would usually be done by passing a NameStr argument. setName() works on Instructions too (both Instruction and Argument inherit Value).
Please consider the following program:
int main() {
int test = 17;
return test;
}
Compile to LLVM_IR: clang++ -S -emit-llvm test.cpp
Looking at the IR, the function main is defined as so:
; Function Attrs: noinline norecurse nounwind optnone uwtable
define dso_local i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
store i32 0, i32* %1, align 4
store i32 17, i32* %2, align 4
%3 = load i32, i32* %2, align 4
ret i32 %3
}
We can see that %2 is the allocation of our test variable, with 17 stored into it, and %3 loads that variable to use as the function's return value (in keeping with the code as we wrote it). However, we see that %1 defines another int-sized variable and initializes it to 0, despite never using it. This extra variable is nowhere to be seen in the C++ source.
I should note that I see the same being generated when I compile using clang rather than clang++.
What is this extra variable?
I assume you are using an old version of clang. In newer versions (I mean v7.0 and later), value names are printed by default. To request them explicitly, you can use -fno-discard-value-names. With this option you'll get the following IR:
define dso_local i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%test = alloca i32, align 4
store i32 0, i32* %retval, align 4
store i32 17, i32* %test, align 4
%0 = load i32, i32* %test, align 4
ret i32 %0
}
Now it is quite clear where the store of 0 comes from. In unoptimized code, the compiler initializes retval to 0.
I used clang to compile this code with -S -emit-llvm:
int sub2(int n) {
return n - 2;
}
And this is the code it outputted:
; Function Attrs: nounwind
define i32 @_Z4sub2i(i32) #0 {
%2 = alloca i32, align 4
store i32 %0, i32* %2, align 4
%3 = load i32, i32* %2, align 4
%4 = sub nsw i32 %3, 2
ret i32 %4
}
However, I could write the same function as:
define i32 @sub2(i32) #0 {
%2 = sub i32 %0, 2
ret i32 %2
}
Why does it add those instructions? I am not sure about it, but it seems to be copying the argument.
This is because you haven't run the mem2reg pass. The variables are considered to occupy space on the stack and are alloca'd.
If you try
opt --mem2reg filename.ll -S
you will see that you get something similar to what you expected.
mem2reg is also a part of O1, O2, and O3.
The mem2reg pass tries to convert "variables" into LLVM temporaries. It does this only for those variables whose address is not taken.
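A minimal sketch of the "address is not taken" condition, using hypothetical function names. In sub2 the parameter is only read and written directly, so its stack slot is a candidate for promotion. In sub2_via_pointer the address of tmp is handed to another function, so, as far as I understand the pass, mem2reg on its own would leave that slot in memory (later passes such as inlining may still remove it).

```cpp
#include <cassert>

// Hypothetical helper that receives the address of a caller's local.
void bump(int *p) {
    assert(p != nullptr);
    *p += 1;
}

// n is only loaded and stored directly, never addressed:
// its alloca can be turned into an SSA value by mem2reg.
int sub2(int n) {
    return n - 2;
}

// The address of tmp escapes into bump(), so tmp's alloca has a use
// other than a plain load or store and is not promotable by mem2reg.
int sub2_via_pointer(int n) {
    int tmp = n;
    bump(&tmp);
    return tmp - 3;
}
```

Emitting IR for both functions and running opt with mem2reg over it should show the alloca disappearing from the first function and staying in the second.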