I wrote the following C code, in which the variable x is assigned twice:
int main()
{
    int x;
    x = 10;
    x = 20;
    return 0;
}
I compile and generate the IR representation using the following command:
clang -emit-llvm -c ssa.c
IR generated
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %x = alloca i32, align 4
  store i32 0, i32* %retval
  store i32 10, i32* %x, align 4
  store i32 20, i32* %x, align 4
  ret i32 0
}
If my understanding of SSA form is correct, this example should produce two LLVM IR variables, x1 and x2, assigned the values 10 and 20 respectively. Is there some specific option we should compile with to get the SSA IR representation, or is my understanding of the IR incorrect? Please advise.
EDIT: as suggested in one answer, using the -mem2reg optimization pass gives me the following output:
clang -c -emit-llvm ssa.c -o ssa.bc
opt -mem2reg ssa.bc -o ssa.opt.bc
llvm-dis ssa.opt.bc
cat ssa.opt.ll
Resultant IR generated
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
  ret i32 0
}
It looks like the x assignments got optimized away entirely by the mem2reg pass. Is there any other way to generate and retain the different x values?
The LLVM passes mem2reg and reg2mem convert code to and from SSA form. You can run them using the opt tool.
Related
I am a beginner in LLVM IR, and I am writing an LLVM transform pass to add some code to a C program, but I am having some trouble with it.
The source C code is
int a[10];

void foo(int i) {
    a[i] = -1;
}
and its LLVM IR is:
define void @foo(i32 noundef %0) #0 {
  %2 = alloca i32, align 4
  store i32 %0, ptr %2, align 4
  %3 = load i32, ptr %2, align 4
  %4 = sext i32 %3 to i64
  %5 = getelementptr inbounds [10 x i32], ptr @a, i64 0, i64 %4
  store i32 -1, ptr %5, align 4
  ret void
}
Now I want to get the variable i and build an if-then construct, so the function looks like:
void foo(int i) {
    a[i] = -1;
    if (i > 5) {
        // do some things
    }
}
My question is: how can I get the variable i and build an if statement in LLVM IR? I know I can use the built-in helper SplitBlockAndInsertIfThen to create an if...then construct, but how do I get i first? Any ideas would be appreciated.
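For reference, the transformed IR I am hoping to end up with would look roughly like this (a hand-written sketch, so the %then/%end labels are made up, not compiler output):

```llvm
define void @foo(i32 noundef %0) #0 {
  %2 = alloca i32, align 4
  store i32 %0, ptr %2, align 4
  %3 = load i32, ptr %2, align 4
  %4 = sext i32 %3 to i64
  %5 = getelementptr inbounds [10 x i32], ptr @a, i64 0, i64 %4
  store i32 -1, ptr %5, align 4
  %6 = load i32, ptr %2, align 4   ; reload i from its stack slot
  %7 = icmp sgt i32 %6, 5          ; i > 5
  br i1 %7, label %then, label %end

then:                              ; "do some things" goes here
  br label %end

end:
  ret void
}
```

From the original IR, i is available both as the function argument %0 and in the stack slot %2 that Clang created at -O0, so a pass can either use the argument value directly or insert a fresh load from the alloca.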
I am having a problem where Clang seemingly generates unnecessary bitcasts (from f32 -> i32 -> f32). The following is the generated IR, with the irrelevant pieces stripped out (I tried highlighting the relevant lines, but that does not work in a code block).
The problem is the bitcast defining %5, followed by the duplicated phi nodes (%6 and %7), and the bitcast store on the last line. I don't see why this datapath is bitcast to i32; the value is not used anywhere else.
The corresponding C code contains only float data types, some structs with nested arrays, and if/else-if/else constructs with some floating-point constants (it is generated code). I have no clue why this happens.
(ccode: https://pastebin.com/c0gYkwhF, llvm ir: https://pastebin.com/NSyhSkUa)
tl;dr:
Why is the bitcast to i32 being generated?
Can I prevent it somehow? (I don't want to deal with the i32 datatype.)
; Function Attrs: nofree norecurse nounwind uwtable
define dso_local void @CurrentControl_step() local_unnamed_addr #0 {
entry:
  ; (...)
  %mul3 = fmul fast float %4, 0x3FEFFFEF80000000
  %add4 = fadd fast float %3, %mul3
  %cmp = fcmp fast ogt float %add4, 0x3FEF400000000000
  br i1 %cmp, label %if.then, label %if.else
  ; (...)

if.else7:                                         ; preds = %if.else
  store float %add4, float* getelementptr inbounds (%struct.DW_CurrentControl_T, %struct.DW_CurrentControl_T* @CurrentControl_DW, i64 0, i32 1, i64 0), align 4, !tbaa !2
  %5 = bitcast float %add4 to i32
  br label %if.end8

if.end8:                                          ; preds = %if.then6, %if.else7, %if.then
  %6 = phi i32 [ -1082523648, %if.then6 ], [ %5, %if.else7 ], [ 1064960000, %if.then ]
  %7 = phi float [ 0xBFEF400000000000, %if.then6 ], [ %add4, %if.else7 ], [ 0x3FEF400000000000, %if.then ]
  ; (...)
  %9 = fsub fast float %7, %mul9
  ; (...)
  store i32 %6, i32* bitcast (float* getelementptr inbounds (%struct.DW_CurrentControl_T, %struct.DW_CurrentControl_T* @CurrentControl_DW, i64 0, i32 2, i64 0) to i32*), align 4, !tbaa !2
; (...)
EDIT:
I have a new minimal example generating a seemingly unnecessary bitcast:
typedef struct {
    float nestedArray[2];
} Foo;

Foo s;
float testVar;

void test(void) {
    testVar = s.nestedArray[0];
}
which generates:
%struct.Foo = type { [2 x float] }

@s = dso_local local_unnamed_addr global %struct.Foo zeroinitializer, align 4
@var = dso_local local_unnamed_addr global float 0.000000e+00, align 4

; Function Attrs: nofree norecurse nounwind uwtable
define dso_local void @test() local_unnamed_addr #0 {
entry:
  %0 = load i32, i32* bitcast (%struct.Foo* @s to i32*), align 4, !tbaa !2
  store i32 %0, i32* bitcast (float* @var to i32*), align 4, !tbaa !2
  ret void
}
The bitcast turns out to be caused by this canonicalization in LLVM (doc):
// Try to canonicalize loads which are only ever stored to operate over
// integers instead of any other type. We only do this when the loaded type
// is sized and has a size exactly the same as its store size and the store
// size is a legal integer type.
// Do not perform canonicalization if minmax pattern is found (to avoid
// infinite loop).
I just compiled a small piece of C code using clang 3.7:
typedef unsigned char char4 __attribute__ ((vector_size (4)));

char4 f1 (char4 v)
{
    return v / 2;
}
That function compiles to (I removed the debug info):
define <4 x i8> @f1(<4 x i8> %v) {
entry:
  %div = udiv <4 x i8> %v, bitcast (<1 x i32> <i32 2> to <4 x i8>)
  ret <4 x i8> %div
}
According to the LLVM documentation, the bitcast operation doesn’t change any bits, meaning that bitcasting <1 x i32> <i32 2> to <4 x i8> should yield <2, 0, 0, 0> (or <0, 0, 0, 2>, depending on endianness). Am I right?
Therefore, I’ll get a division-by-zero exception.
The code I wrote intended to make a broadcast (or splat), and not a bitcast.
Could someone please explain what’s happening?
Thanks!
Actually, it looks like a bug in Clang:
https://llvm.org/bugs/show_bug.cgi?id=27085
This input code should either not compile, generate a warning, or compile to a vector splat.
For some reason, an empty while loop hangs in a release build while working fine in a debug build. This example works in debug but hangs in release:
//Wait for stream to open
while (!_isReadyForData);
This is the solution I came up with in order to get it to work in release:
//Wait for stream to open
while (!_isReadyForData)
{
    //For some reason in release mode, this is needed
    sleep(.5);
}
I am just curious why I would need to add something in the loop block of code.
The reason is of course compiler optimization, as already noted in the comments.
Remembering that Objective-C is built on top of C, I put together a simple C example compiled at different optimization levels, and here are the results.
Original code
int main(int argc, char const *argv[]) {
    char _isReadyForData = 0;
    while (!_isReadyForData);
    return 0;
}
LLVM IR with no optimizations (-O0)
define i32 @main(i32 %argc, i8** %argv) #0 {
entry:
  %retval = alloca i32, align 4
  %argc.addr = alloca i32, align 4
  %argv.addr = alloca i8**, align 8
  %_isReadyForData = alloca i8, align 1
  store i32 0, i32* %retval
  store i32 %argc, i32* %argc.addr, align 4
  store i8** %argv, i8*** %argv.addr, align 8
  store i8 0, i8* %_isReadyForData, align 1
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %0 = load i8* %_isReadyForData, align 1
  %tobool = icmp ne i8 %0, 0
  %lnot = xor i1 %tobool, true
  br i1 %lnot, label %while.body, label %while.end

while.body:                                       ; preds = %while.cond
  br label %while.cond

while.end:                                        ; preds = %while.cond
  ret i32 0
}
LLVM IR with level 1 optimizations (-O1)
define i32 @main(i32 %argc, i8** nocapture %argv) #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.cond, %entry
  br label %while.cond
}
As you can see, the compiler produces an infinite loop when optimizing, since the local variable _isReadyForData is useless in that context and therefore is removed.
As suggested by @faffaffaff, using the volatile keyword on _isReadyForData may solve the issue.
LLVM IR with level 1 optimizations (-O1) with volatile keyword
define i32 @main(i32 %argc, i8** nocapture %argv) #0 {
entry:
  %_isReadyForData = alloca i8, align 1
  store volatile i8 0, i8* %_isReadyForData, align 1
  br label %while.cond

while.cond:                                       ; preds = %while.cond, %entry
  %_isReadyForData.0.load1 = load volatile i8* %_isReadyForData, align 1
  %lnot = icmp eq i8 %_isReadyForData.0.load1, 0
  br i1 %lnot, label %while.cond, label %while.end

while.end:                                        ; preds = %while.cond
  ret i32 0
}
But I definitely agree with @rmaddy in saying that you'd better change the flow of your program and use event-driven logic, instead of patching what you already have.
Please note: I'm just trying to learn. Please do not yell at me for toying with assembly.
I have the following method:
uint32 test(int16 a, int16 b)
{
    return ( a + b ) & 0xffff;
}
I created a .s file based on details I found here.
My .s file contains the following:
.macro BEGIN_FUNCTION
.align 2 // Align the function code to a 4-byte (2^n) word boundary.
.arm // Use ARM instructions instead of Thumb.
.globl _$0 // Make the function globally accessible.
.no_dead_strip _$0 // Stop the optimizer from ignoring this function!
.private_extern _$0
_$0: // Declare the function.
.endmacro
.macro END_FUNCTION
bx lr // Jump back to the caller.
.endmacro
BEGIN_FUNCTION addFunction
add r0, r0, r1 // Return the sum of the first 2 function parameters
END_FUNCTION
BEGIN_FUNCTION addAndFunction
add r0, r0, r1 // Return the sum of the first 2 function parameters
ands r0, r0, r2 // Ands the result of r0 and the third parameter passed
END_FUNCTION
So if I call the following:
addFunction(10,20)
I get what I would expect. But then if I try
int addOne = addFunction(0xffff,0xffff); // Result = -2
int addTwo = 0xffff + 0xffff; // Result = 131070
My addOne does not end up being the same value as my addTwo. Any ideas on what I am doing wrong here?
When you pass the int16_t parameter 0xffff to addFunction the compiler sign-extends it to fit in the 32-bit register (as it's a signed integer) making it 0xffffffff. You add two of these together to get 0xfffffffe and return it. When you do 0xffff + 0xffff both constants are already 32-bits and so there's no need to sign extend, the result is 0x0001fffe.
What you need to understand is that when you use a signed integer and pass the 16-bit value 0xffff, you are actually passing -1, so it's no surprise that -1 + -1 = -2. It's also no surprise that 0xffffffff + 0xffffffff = 0xfffffffe (truncated to 32 bits).
If you change the (not shown) C declaration of addFunction to take two unsigned ints, then you'll get the result you desire (the compiler won't do the sign extension).
Also, your assembly code for addAndFunction assumes a third parameter (in R2), but if you call it with only 2 parameters, the contents of R2 could be anything.