I have a huge C project and I want to create an AST for a particular C file in that project with clang. I also need to find function definitions when an external function is called (so when I see foo() I can find where foo was defined, not just declared). I couldn't find any information on how to link my project when parsing with clang to get the AST so that clang can find function definitions. Any ideas?
Related
I am trying to parse the clang-c Index.h file with ClangSharp (just to test the ClangSharp parser in C#) and I found that it fails to parse functions because of the CINDEX_LINKAGE macro in the function declaration.
If I remove it, parser will correctly find FunctionDecl and parse it without errors.
I cannot understand how this macro prevents functions from being parsed. Does someone know how to work around this?
The issue was in the #include line itself. By default, the clang header's includes are set up to search in the directory one level up, but clang itself for some reason does not understand that include format.
I use the Clang Python binding to extract the AST of C/C++ files. It works perfectly for a simple program I wrote. The problem is when I want to use it for a big project like openssl. I can run clang on any single file of the project, but clang seems to miss some headers of the project and gives me the AST of only a few functions of the file, not all of them. I set the include folder with -I, but I still get only part of the functions.
This is my code:
import clang.cindex as cl
cl.Config.set_library_path(clang_lib_dir)
index = cl.Index.create()
lib = 'Path to include folder'
args = ['-I{}'.format(lib)]
translation_unit = index.parse(source_file, args=args)
my_get_info(translation_unit.cursor)
I receive a lot of "header file not found" errors.
UPDATE
I used Make to compile openssl with clang. I can pass the -emit-ast option to clang to dump the AST of each file, but I cannot read it with the clang Python binding.
Any clues how I can save the serialized representation of the translation units so that I will be able to read it with index.read()?
Thank you!
You would "simply" need to provide the right args. But be aware of two possible issues.
Different files may require different arguments for parsing. The easiest solution is to obtain a compilation database and then extract the compile commands from it. If you go this way, be aware that you need to filter the arguments a bit and remove things like -c FooBar.cpp (and potentially some others), otherwise you may get something like ASTReadError.
Another issue is that the include paths (-I ...) may be relative to the source directory. I.e., if a file main.cpp is compiled from the directory /opt/project/ with the argument -I include/path, then before calling index.parse(source_file, args=args) you need to step (chdir) into /opt/project, and when you are done you will probably need to go back to the original working directory. So the code may look like this (a sketch):
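As a rough sketch of that filtering step, assuming the standard JSON compilation database layout (entries with "directory", "file", and either "command" or "arguments"): the helper name parse_args_for and the sample entry are made up for illustration, and dropping -o together with its output file is my own assumption about the "potentially some others".

```python
import shlex

def parse_args_for(entry):
    """Return the compiler arguments of one compilation-database entry,
    minus the pieces that tend to confuse index.parse()."""
    # An entry carries either an "arguments" list or a "command" string.
    argv = entry.get("arguments") or shlex.split(entry["command"])
    source = entry["file"]
    kept = []
    skip_next = False
    for arg in argv[1:]:              # argv[0] is the compiler executable
        if skip_next:                 # value belonging to a dropped flag
            skip_next = False
            continue
        if arg == "-c" or arg == source:
            continue                  # "-c FooBar.cpp" as mentioned above
        if arg == "-o":               # assumption: the output file is not needed
            skip_next = True
            continue
        kept.append(arg)
    return kept

# Hypothetical entry, as it might appear in compile_commands.json.
entry = {
    "directory": "/opt/project",
    "command": "clang -Iinclude/path -DNDEBUG -c main.cpp -o main.o",
    "file": "main.cpp",
}
print(parse_args_for(entry))  # ['-Iinclude/path', '-DNDEBUG']
```

The resulting list is what would be passed as args to index.parse(); a real project may need further flags removed.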
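import os
cwd = os.getcwd()
os.chdir('/opt/project')  # directory the file was compiled from
try:
    translation_unit = index.parse(source_file, args=args)
finally:
    os.chdir(cwd)  # always return to the original working directory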
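An alternative to changing directories, sketched here under the assumption that only -I flags carry relative paths, is to make those paths absolute before parsing. The helper name absolutize_includes is hypothetical, and other path-carrying flags (for example -isystem) are not handled.

```python
import os

def absolutize_includes(args, directory):
    """Rewrite relative -I paths so they no longer depend on the cwd."""
    out = []
    it = iter(args)
    for arg in it:
        if arg == "-I":                          # form: -I path
            path = next(it, None)
            if path is None:                     # dangling -I, keep as-is
                out.append(arg)
                continue
            out += ["-I", os.path.normpath(os.path.join(directory, path))]
        elif arg.startswith("-I"):               # form: -Ipath
            path = os.path.normpath(os.path.join(directory, arg[2:]))
            out.append("-I" + path)
        else:
            out.append(arg)                      # everything else untouched
    return out

args = ["-Iinclude/path", "-DNDEBUG", "-I", "/usr/include"]
print(absolutize_includes(args, "/opt/project"))
```

os.path.join leaves already-absolute paths alone, so mixing relative and absolute include directories is fine.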
I hope it helps.
I need to add headers to an already existing program by transforming it with LLVM and Clang.
I have used clang's rewriter to accomplish similar things, such as changing function names and arguments.
But the header files aren't present in clang's AST. I already know we need to use PPCallbacks (https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html) but I am in dire need of some examples on how to make it work with the rewriter if at all possible.
Alternatively, adding a #include statement just before the first
using namespace <namespace>;
also works. I would like to see an example of this as well.
Any help would be appreciated.
There is a bit of confusion in your question. You need to understand in detail how the preprocessor works. Be aware that most of C++ compilation happens after the preprocessing phase (so most C++ static analyzers work after that phase).
In other words, the C++ specification (and also the C specification) defines first what is preprocessing, and then what is the syntax and the semantics of the preprocessed form.
That is, when compiling foo.cc your compiler sees the preprocessed form foo.ii that you could obtain with clang++ -C -E foo.cc > foo.ii
In the 1980s the preprocessor /lib/cpp was a separate program forked by the compiler (and some temporary foo.ii sat on the disk and was removed at the end of compilation). Today, for performance reasons, preprocessing is an initial step done inside the compiler. But you can reason as if it were still separate.
Either you want to alter the Clang compiler, and it deals (like every other C++ compiler or C++ static analyzer) mostly with the preprocessed form. Then you don't want to add new #include-s, but you want to alter the flow of AST given to the compiler (after preprocessing), and that is a different question: you then want to add some AST between existing AST elements (independently of any preprocessor directives).
Or you want to automatically change the C++ source code. The hard part is determining what you want to change and at what place. I suppose that you have used complex stuff to determine that a #include <vector> has to be inserted after line 34 of file foo.cc. Once you've got that information (and getting it is the hard thing), doing the insertion is pretty trivial. For example, you could read every C++ source line, and insert your line when you have read enough lines.
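To illustrate how simple the insertion itself is once the position is known, here is a minimal sketch. The function name insert_include and the sample source are made up; it either inserts after a given 1-based line number or, if none is given, just before the first using namespace line (falling back to the top of the file).

```python
def insert_include(lines, include_line, after_line=None):
    """Return a copy of `lines` with `include_line` inserted."""
    result = list(lines)
    if after_line is not None:
        # Insert after the given 1-based line, clamped to the end of file.
        pos = min(after_line, len(result))
    else:
        # Insert before the first "using namespace" line, else at the top.
        pos = next(
            (i for i, text in enumerate(result)
             if text.lstrip().startswith("using namespace")),
            0,
        )
    result.insert(pos, include_line)
    return result

source = [
    "#include <iostream>",
    "using namespace std;",
    "int main() { return 0; }",
]
patched = insert_include(source, "#include <vector>")
print("\n".join(patched))
```

Deciding which header to add and where remains the hard part; this only covers the text edit.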
I'm going through the EUnit chapter in Learn You Some Erlang and one thing I am noticing from all the code samples is the test functions are never declared in -export() clauses.
Why is EUnit able to pick these test functions up?
From the documentation:
The simplest way to use EUnit in an Erlang module is to add the following line at the beginning of the module (after the -module declaration, but before any function definitions):
-include_lib("eunit/include/eunit.hrl").
This will have the following effect:
Creates an exported function test() (unless testing is turned off, and the module does not already contain a test() function), that can be used to run all the unit tests defined in the module
Causes all functions whose names match ..._test() or ..._test_() to be automatically exported from the module (unless testing is turned off, or the EUNIT_NOAUTO macro is defined)
Glad I found this question because it gives me a meaningful way to procrastinate and I was wondering how functions get created and exported dynamically.
Started by looking at the latest commit affecting EUnit in the Erlang/OTP Github repo, which is 4273cbd. (The only reason for this was to find a relatively stable anchor instead of git branches.)
0. Include EUnit's header file
According to EUnit's User's Guide, the first step is to -include_lib("eunit/include/eunit.hrl"). in the tested module, so I assume this is where the magic happens.
1. otp/lib/eunit/include/eunit.hrl (lines 79 - 91)
%% Parse transforms for automatic exporting/stripping of test functions.
%% (Note that although automatic stripping is convenient, it will make
%% the code dependent on this header file and the eunit_striptests
%% module for compilation, even when testing is switched off! Using
%% -ifdef(EUNIT) around all test code makes the program more portable.)
-ifndef(EUNIT_NOAUTO).
-ifndef(NOTEST).
-compile({parse_transform, eunit_autoexport}).
-else.
-compile({parse_transform, eunit_striptests}).
-endif.
-endif.
1.1 What does -compile({parse_transform, eunit_autoexport}). mean?
From the Erlang Reference Manual's Module chapter (Pre-Defined Module Attributes):
-compile(Options).
Compiler options. Options is a single option or a list of options. This attribute is added to the option list when
compiling the module. See the compile(3) manual page in Compiler.
On to compile(3):
{parse_transform,Module}
Causes the parse transformation function
Module:parse_transform/2 to be applied to the parsed code before the
code is checked for errors.
From the erl_id_trans module:
This module performs an identity parse transformation of Erlang code.
It is included as an example for users who wants to write their own
parse transformers. If option {parse_transform,Module} is passed to
the compiler, a user-written function parse_transform/2 is called by
the compiler before the code is checked for errors.
Basically, if module M includes the {parse_transform, Module} compile option, then all of M's functions and attributes can be iterated through using your implementation of Module:parse_transform/2. Its first argument is Forms, which is M's module declaration in Erlang's abstract format (described in the Erlang Run-Time System Application (ERTS) User's Guide).
2. otp/lib/eunit/src/eunit_autoexport.erl
This module only exports parse_transform/2 to satisfy the {parse_transform, Module} compile option, and its first order of business is to figure out the configured suffixes for test case functions and generators. If they are not set manually, it uses _test and _test_ respectively (via lib/eunit/src/eunit_internal.hrl).
It then scans all the functions and attributes of your module using eunit_autoexport:form/5, and builds a list of functions to be exported where the suffixes above match (plus the original functions. I may be wrong on this one...).
Finally, eunit_autoexport:rewrite/2 builds a module declaration from the original Forms (given to eunit_autoexport:parse_transform/2 as the first argument) and the list of functions to be exported (that was supplied by form/5 above). On line 82 it injects the test/0 function mentioned in the EUnit documentation.
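For readers who find the idea easier to follow outside Erlang, here is a loose Python analogy of the same three steps (scan the parsed forms, collect names matching a suffix, rewrite the export list). This is not EUnit's code: it uses Python's ast module, treats __all__ as the stand-in for -export, and the function name autoexport is invented for the example.

```python
import ast

def autoexport(source, suffix="_test"):
    """Parse `source`, then prepend an __all__ listing every top-level
    function whose name ends with `suffix` (analogy to auto-exporting)."""
    tree = ast.parse(source)                       # the parsed "forms"
    names = [node.name for node in tree.body
             if isinstance(node, ast.FunctionDef)
             and node.name.endswith(suffix)]       # scan for matching names
    export = ast.parse("__all__ = " + repr(names)).body[0]
    tree.body.insert(0, export)                    # rewrite the module
    return ast.fix_missing_locations(tree), names

module_source = """
def add(a, b):
    return a + b

def add_test():
    assert add(1, 2) == 3
"""

tree, exported = autoexport(module_source)
namespace = {}
exec(compile(tree, "<demo>", "exec"), namespace)   # compile the rewritten tree
print(namespace["__all__"])  # ['add_test']
```

As in the Erlang case, the transformation runs on the parsed representation before the code is compiled, which is why the test functions never need to be exported by hand.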
I have a static library, where one of the objects defines a symbol:
nm mylib.a
...
00007340 t _a_local_symbol
...
I need to access the function from my C code. Obviously, I don't have the source code for the library, so I can work only with the archive file that I have at hand.
This is further restricted by iOS linker.
A bit more context. The library is Objective-C++, the function in question is pure C. I don't have original headers, but I've got the function signature restored.
objcopy has a flag to do what you want:
--globalize-symbol <name> Force symbol <name> to be marked as a global
Not sure whether objcopy works on iOS object files though.
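To check which symbols would need globalizing (and to verify the result afterwards), the nm listing can be scanned: a lowercase type letter such as t usually marks a local symbol, an uppercase one a global symbol. A small sketch, where the function name local_symbols and all sample lines except _a_local_symbol are made up:

```python
def local_symbols(nm_output):
    """Return names of symbols whose nm type letter is lowercase.

    Lowercase usually means local; a few letters (e.g. weak symbols)
    are exceptions, so treat the result as a hint only.
    """
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols print as: <address> <type> <name>
        if len(parts) == 3 and len(parts[1]) == 1 and parts[1].islower():
            found.append(parts[2])
    return found

# Hypothetical nm output; only the first line comes from the question.
sample = """\
00007340 t _a_local_symbol
00007400 T _a_global_symbol
         U _printf
"""
print(local_symbols(sample))  # ['_a_local_symbol']
```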