How to get the Objective-C selector name in LLVM IR? - ios

I am a beginner with LLVM. I am trying to develop an LLVM pass that finds every comparison of the form "self.currentGameID == 2007" and replaces it with "true".
I managed to find the comparison, and the right operand "2007" is easy to confirm, but when I check whether the called function is "currentGameID", I only get "objc_msgSend".
Here is the code:
bool handleComp(ICmpInst *cmpInst) {
    if (!cmpInst->hasOneUse()) {
        return false;
    }
    // Guard the cast: the RHS operand is not guaranteed to be a constant.
    auto *cmpConst = dyn_cast<ConstantInt>(cmpInst->getOperand(1));
    if (!cmpConst) {
        return false;
    }
    auto constIntValue = cmpConst->getValue().getSExtValue();
    if (constIntValue == 2007) {
        auto *preInstruction = cmpInst->getPrevNode(); // may be null
        if (preInstruction && isa<CallInst>(preInstruction)) {
            CallInst *ci = cast<CallInst>(preInstruction);
            Function *function = ci->getCalledFunction();
            if (!function) {
                // Indirect call: look through pointer casts to the callee.
                function = dyn_cast<Function>(ci->getCalledOperand()->stripPointerCasts());
            }
            if (function) {
                errs().write_escaped(function->getName()) << '\n';
            }
        }
        return true;
    }
    return false;
}
I do get the function name, but it is "objc_msgSend"!
Can anyone tell me how to get the actual selector name, "currentGameID"?

Accessing a property like self.currentGameID is the same as [self currentGameID]. That is, they both produce the same Objective-C runtime call: objc_msgSend(self, @selector(currentGameID)).
The Objective-C code if (self.currentGameID == 2007) will produce (roughly) the following IR:
%1 = load i8*, i8** @OBJC_SELECTOR_REFERENCES_.0
%2 = call i64 bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i64 (i8*, i8*)*)(i8* %0, i8* %1)
%3 = icmp eq i64 %2, 2007
So to get the actual selector, look at the call's arguments rather than the callee: for objc_msgSend, argument 0 is the receiver (self) and argument 1 is the selector, so ci->getArgOperand(1) gives you the loaded selector pointer.
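To turn that operand into the selector string, you then have to peel back what clang emitted for %1: it is a load from an @OBJC_SELECTOR_REFERENCES_ global whose initializer points into the selector-name string. The relevant globals look roughly like this (the names are illustrative of clang's usual output; exact types and attributes vary by clang version):

```llvm
@OBJC_METH_VAR_NAME_ = private unnamed_addr constant [14 x i8] c"currentGameID\00"
@OBJC_SELECTOR_REFERENCES_.0 = internal externally_initialized global i8*
    getelementptr inbounds ([14 x i8], [14 x i8]* @OBJC_METH_VAR_NAME_, i32 0, i32 0)
```

So in the pass: dyn_cast the argument to a LoadInst, dyn_cast its pointer operand to a GlobalVariable, strip the GEP in that global's initializer, and read the underlying ConstantDataArray to recover "currentGameID".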

Related

Looping over an integer range in Zig

Is a while-loop like this the idiomatic way to loop over an integer range in Zig?
var i: i32 = 5;
while (i < 10) : (i += 1) {
    std.debug.print("{}\n", .{i});
}
I first tried the python-like
for (5..10) |i| {
    // ....
}
but that doesn't work.
Zig has no built-in integer-range loop, but there's a hack by nektro that creates a []void slice of the right length, so you can iterate over it with a for loop:
const std = @import("std");

fn range(len: usize) []const void {
    return @as([*]void, undefined)[0..len];
}

for (range(10)) |_, i| {
    std.debug.print("{d}\n", .{i});
}

The LLVM HelloWorld pass from the tutorial does not run if the IR is produced by clang

I am very new to LLVM/clang and trying to write my custom LLVM pass using the new pass manager.
My first step was to use the HelloWorld pass from the official documentation.
It works fine when I use the file a.ll provided by the documentation, with the command ./bin/opt a.ll -passes=helloworld -S, which prints:
foo
bar
; ModuleID = 'a.ll'
source_filename = "a.ll"
define i32 @foo() {
  %a = add i32 2, 3
  ret i32 %a
}
define void @bar() {
  ret void
}
Now I have created a C file a2.c with:
void test(){
}
And generate the IR with ./bin/clang -S -emit-llvm a2.c
Running the previous opt command on a2.ll gives
; ModuleID = 'a2.ll'
source_filename = "a2.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @test() #0 {
entry:
  ret void
}
attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 13.0.0 (https://github.com/llvm/llvm-project.git 8e7df996e3054cc174b91bc103057747c8349c06)"}
I cannot see the expected "test" at the beginning of the output, so my pass is not being run by the pass manager.
Any ideas on what is wrong with my pass?
Thanks for your help.
[EDIT]
Using the clang flag -O1 or higher solves the issue, but I do not understand why.
clang at -O0 adds the optnone attribute to every function, which disables any further processing of the IR by transformation passes.
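If you want the pass to run on unoptimized IR, you can ask clang to drop that attribute with -Xclang -disable-O0-optnone (available in clang 7 and later; the ./bin paths follow the question's layout):

```shell
# Emit -O0 IR without the optnone attribute, then run the pass on it.
./bin/clang -O0 -Xclang -disable-O0-optnone -S -emit-llvm a2.c
./bin/opt a2.ll -passes=helloworld -S
```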

How do I pass a C string from a C routine to a Go function (and convert it to a Go string)?

This must be something really silly and basic, but the cgo docs (and my Google fu) have left me stranded. Here's what I am trying to do: I want a Go function to call a C function via import "C". Said C function needs to store the address of a C string (malloc'd or constant; neither has worked for me) into an argument passed to it as *C.char. The Go function then needs to convert this to a Go string. It actually does work, except I get this:
panic: runtime error: cgo argument has Go pointer to Go pointer
If I run with GODEBUG=cgocheck=0, it all works fine. If I leave the default, I get the panic after this output:
strptr = 4e1cbf ('this is a C string!')
main: yylex returned token 1
yylval.tstrptr 4e1cbf
stringval token "this is a C string!"
The problematic line seems to be:
yylval.stringval = C.GoString(yylval.tstrptr)
What little I can find about C.GoString left me with the impression that it allocates a Go string and fills it in from the C string provided, but that seems not to be the case; otherwise, why would I get a complaint about a 'Go pointer to Go pointer'? I've tried a number of other approaches, like having the C function malloc the buffer and the Go function call C.free() on it. Nothing has worked (where "worked" == avoiding this runtime panic).
The Go source:
package main

import (
    "fmt"
    "unsafe"
)

// #include <stdio.h>
// int yylex (void * foo, void *tp);
import "C"

type foo_t struct {
    i int32
    s string
}

var foo foo_t

func main() {
    var retval int
    var s string
    var tp *C.char
    for i := 0; i < 2; i++ {
        retval = int(C.yylex(unsafe.Pointer(&foo), unsafe.Pointer(&tp)))
        fmt.Printf("main: yylex returned %d\n", retval)
        fmt.Printf("tp = %x\n", tp)
        if retval == 0 {
            s = C.GoString(tp)
            fmt.Printf("foo.i = %d s = %q\n", foo.i, s)
        } else {
            foo.s = C.GoString(tp)
            fmt.Printf("foo.i = %d foo.s = %q\n", foo.i, foo.s)
        }
    }
}
The C source:
#include <stdio.h>

int yylex (int * foo, char ** tp)
{
    static int num;
    *foo = 666;
    *tp = "this is a C string!";
    printf ("strptr = %x ('%s')\n", *tp, *tp);
    return (num++);
}
What's interesting is that if the Go func stores into foo.s first, the 2nd call to yylex bombs with the panic. If I do s and then foo.s (depending on whether I check retval as 0 or non-zero), it doesn't fail, but I'm guessing that is because the Go function exits right away and there are no subsequent calls to yylex.

Is 'none' one of the basic types in Lua?

The basic types are defined in Lua's C header as below:
/*
** basic types
*/
#define LUA_TNONE (-1)
#define LUA_TNIL 0
#define LUA_TBOOLEAN 1
#define LUA_TLIGHTUSERDATA 2
#define LUA_TNUMBER 3
#define LUA_TSTRING 4
#define LUA_TTABLE 5
#define LUA_TFUNCTION 6
#define LUA_TUSERDATA 7
#define LUA_TTHREAD 8
#define LUA_NUMTAGS 9
The Lua documentation says there are only 8 basic types in Lua, yet there are 10 defines here. I know LUA_TLIGHTUSERDATA and LUA_TUSERDATA are both ultimately represented as userdata, but what about LUA_TNONE? And what is the difference between none and nil?
As was already mentioned in the comments, none is used in the C API to check whether there is no value. Consider the following script:
function foo(arg)
    print(arg)
end

foo(nil) --> nil
foo() --> nil
In Lua you can use select('#', ...) to get the number of parameters passed to a function, and with the C API you can check whether the user supplied no argument at all (using lua_isnone). Consider the following small C library, which works like type, except that it can recognize when no argument was given:
#include <stdio.h>
#include <lua.h>
#include <lauxlib.h>

static int which_type(lua_State* L)
{
    // at first, we start with the check for no argument
    // if this is false, there has to be at least one argument
    if(lua_isnone(L, 1))
    {
        puts("none");
    }
    // now iterate through all arguments and print their type
    int n = lua_gettop(L);
    for(int i = 1; i <= n; ++i)
    {
        if(lua_isboolean(L, i))
        {
            puts("boolean");
        }
        else if(lua_istable(L, i))
        {
            puts("table");
        }
        else if(lua_isstring(L, i) && !lua_isnumber(L, i))
        {
            puts("string");
        }
        else if(lua_isnumber(L, i))
        {
            puts("number");
        }
        else if(lua_isfunction(L, i))
        {
            puts("function");
        }
        else if(lua_isnil(L, i))
        {
            puts("nil");
        }
        else if(lua_isthread(L, i))
        {
            puts("thread");
        }
        else if(lua_isuserdata(L, i))
        {
            puts("userdata");
        }
    }
    return 0;
}

static const struct luaL_Reg testclib_functions[] = {
    { "type", which_type },
    { NULL, NULL }
};

int luaopen_testclib(lua_State* L)
{
    luaL_newlib(L, testclib_functions);
    return 1;
}
Compile this with something like gcc -shared -fPIC -o testclib.so testclib.c. In Lua, we now load the library and use the function type:
local p = require "testclib"
p.type(nil) --> nil
p.type({}) --> table
p.type("foo") --> string
-- now call it without any arguments
p.type() --> none, not nil
--type() -- error: bad argument #1 to 'type' (value expected)
Note that you can't get 'none' and some other type from one call (while it is possible to receive multiple types by passing multiple arguments, e.g. p.type("foo", 42)). This is quite logical, since it would be a syntax error to write something like this:
p.type(, 42) -- error
One use of this can be seen in the print function: print(something) prints the value (even if it is nil), whereas print() prints just a newline.

Retrieve LHS/RHS value of operator

I'm looking to do something similar to this: how to get an integer variable's name and its value from an Expr* in clang using the RecursiveASTVisitor.
The goal is to first retrieve all assignment operations then perform my own checks on them, to do taint analysis.
I've overridden the VisitBinaryOperator as such
bool VisitBinaryOperator(BinaryOperator *bOp) {
    if (!bOp->isAssignmentOp()) {
        return true;
    }
    Expr *LHSexpr = bOp->getLHS();
    Expr *RHSexpr = bOp->getRHS();
    LHSexpr->dump();
    RHSexpr->dump();
    return true;
}
This RecursiveASTVisitor is being run on Objective-C code, so I do not know what the LHS or RHS will evaluate to (the RHS could even be a function call).
Would it be possible to get the text representation of what is on the LHS/RHS out of clang, in order to run regular expressions on them?
Sorry, I found something similar that works for this particular case.
Solution:
bool VisitBinaryOperator(BinaryOperator *bOp) {
    if (!bOp->isAssignmentOp()) {
        return true;
    }
    Expr *LHSexpr = bOp->getLHS();
    Expr *RHSexpr = bOp->getRHS();
    std::string LHS_string = convertExpressionToString(LHSexpr);
    std::string RHS_string = convertExpressionToString(RHSexpr);
    return true;
}

std::string convertExpressionToString(Expr *E) {
    SourceManager &SM = Context->getSourceManager();
    clang::LangOptions lopt;
    SourceLocation startLoc = E->getLocStart();
    SourceLocation _endLoc = E->getLocEnd();
    SourceLocation endLoc = clang::Lexer::getLocForEndOfToken(_endLoc, 0, SM, lopt);
    return std::string(SM.getCharacterData(startLoc),
                       SM.getCharacterData(endLoc) - SM.getCharacterData(startLoc));
}
The only thing I'm not very sure about is why _endLoc is required to compute endLoc, and how the Lexer is actually working here.
EDIT:
Link to the post where I found help: Getting the source behind clang's AST
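On the remaining question: getLocEnd (renamed getEndLoc in Clang 7+, alongside getLocStart → getBeginLoc) returns the location of the start of the expression's last token, so Lexer::getLocForEndOfToken re-lexes that token to find where it actually ends; without that step the final token would be cut off. On newer Clang you can also let the Lexer do all of this for you; a sketch along those lines (assuming a valid ASTContext, as in the question):

```cpp
#include "clang/AST/ASTContext.h"
#include "clang/AST/Expr.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Lex/Lexer.h"
#include <string>

// Returns the exact source text an expression was parsed from.
std::string convertExpressionToString(const clang::Expr *E,
                                      const clang::ASTContext &Context) {
    const clang::SourceManager &SM = Context.getSourceManager();
    // getTokenRange marks the end as "start of the last token";
    // getSourceText internally extends it past that token's end.
    return clang::Lexer::getSourceText(
               clang::CharSourceRange::getTokenRange(E->getSourceRange()),
               SM, Context.getLangOpts())
        .str();
}
```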
