Add #include's to the headers of a program using llvm clang - clang

I need to add headers to an already existing program by transforming it with LLVM and Clang.
I have used clang's rewriter to accomplish a similar thing in the changing function names and arguments, etc.
But the header files aren't present in clang's AST. I already know we need to use PPCallbacks (https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html) but I am in dire need of some examples on how to make it work with the rewriter if at all possible.
Alternatively, adding a #include statement just before the first
using namespace <namespace>;
Also works. I would like to know an example of this as well.
Any help would be appreciated.

There is a bit of confusion in your question. You need to understand in details how the preprocessor works. Be aware that most of C++ compilation happens after the preprocessing phase (so most C++ static analyzers work after that phase).
In other words, the C++ specification (and also the C specification) defines first what is preprocessing, and then what is the syntax and the semantics of the preprocessed form.
In other words, when compiling foo.cc your compiler see the preprocessed form foo.ii that you could obtain with clang++ -C -E foo.cc > foo.ii
In the 1980s the preprocessor /lib/cpp was a separate program forked by the compiler (and some temporary foo.ii was sitting on the disk and removed at end of compilation). Today, it is -for performance reasons- some initial processing done inside the compiler. But you could reason as if it was still separate.
Either you want to alter the Clang compiler, and it deals (like every other C++ compiler or C++ static analyzer) mostly with the preprocessed form. Then you don't want to add new #include-s, but you want to alter the flow of AST given to the compiler (after preprocessing), and that is a different question: you then want to add some AST between existing AST elements (independently of any preprocessor directives).
Or you want to automatically change the C++ source code. The hard part is determining what you want to change and at what place. I suppose that you have used complex stuff to determine that a #include <vector> has to be inserted after line 34 of file foo.cc. Once you've got that information (and getting it is the hard thing), doing the insertion is pretty trivial. For example, you could read every C++ source line, and insert your line when you have read enough lines.

Related

Clang libtooling: how to print compiler macro definitions

I have a LibTooling based utility and I would like to output a list of macro definitions for debug purposes. One can print compiler macro definitions with clang/gcc -dM -E -, but it does not seem to work if I pass -dM -E or -dD to ClangTool. Is it possible to do so with LibTooling API or CLI options in any way? It does not matter if it will include macros defined in a parsed source code or not.
I've looked at other similar questions, and as far as I can tell they all are about macros expanded in a parsed source code. That's not exactly what I need.
The answer is blindingly obvious—in retrospect. clang::Preprocessor::getPredefines() returns just that—a list of compiler predefines. The preprocessor object can be obtained e.g. from clang::CompilerInstance, as an argument in clang::DiagnosticConsumer::BeginSourceFile(), etc.
Just for the sake of completeness, this functionality is not available via libclang API, but I could do that using the fact that all predefines are present in the beginning of a translation unit. So after parsing I just listed all top-level cursors of CursorKind.MACRO_DEFINITION that are not in any real location (location.file is None using python bindings notation)

Prevent ArmClang to add calls to Standard C library

I am evaluating Keil Microvision IDE on STM32H753.
I am doing compiler comparison between ARMCC5 and AC6 in the different optimisation levels. AC6 is based on Clang.
My code is not using memcpy and I have unchecked "Use MicroLIB" in the project settings , However a basic byte per byte copy loop in my code is replaced by a memcpy with AC6 (only in "high" optimisation levels). It doesn't happen with ARMCC5.
I tried using compilation options to avoid that, as described here: -ffreestanding and -disable-simplify-libcalls, at both compiler and linker levels but it didn't change (for the second option, I get an error message saying that the option is not supported).
In the ARMCLANG reference guide i've found the options -nostdlib -nostdlibinc that prevent (??) the compiler to use any function of a standard lib.
However I still need the math.h function.
Do you know how to prevent clang to use functions from the Standard C Lib that are not explicitely called in the code ?
EDIT: here is a quick and dirty reproduceable example:
https://godbolt.org/z/AX8_WV
Please do not discuss the quality of this example, I know it is dumb !!, I know about memset, etc... It is just to understand the issue
gcc know a lot about the memcpy, memset and similar functions and even they are called "the builtin functions". If you do not want those functions to be used by default just use the command line option -fno-builtin
https://godbolt.org/z/a42m4j

Parse c-clang index.h file with with clang itself

I am trying to parse c-clang index.h file with ClangSharp (just for testing purposes of ClangSharp parser on C#) and I found that it misses parsing of functions because of CINDEX_LINKAGE macro in the function declaration.
If I remove it, parser will correctly find FunctionDecl and parse it without errors.
I cannot understand how this macro preventing functions from being parsed. Does someone know how to workaround this?
Issue was in the #include line itself. By default, clang header includes setup to search in the directory on one level up, but clang itself by some reason does not understand such
include format.

AST of a project by Clang

I use the Clang python binding to extract the AST of c/c++ files. It works perfectly for a simple program I wrote. The problem is when I want to employ it for a big project like openssl. I can run clang for any single file of the project, but clang seems to miss some headers of the project, and just gives me the AST of a few functions of the file, not all of the functions. I set the include folder by -I, but still getting part of the functions.
This is my code:
import clang.cindex as cl
cl.Config.set_library_path(clang_lib_dir)
index = cl.Index.create()
lib = 'Path to include folder'
args = ['-I{}'.format(lib)]
translation_unit = index.parse(source_file, args=args)
my_get_info(translation_unit.cursor)
I receive too many header files not found errors.
UPDATE
I used Make to compile openssl by clang? I can pass -emit-ast option to clang to dump the ast of each file, but I cannot read it now by the clang python binding.
Any clues how I can save the the serialized representation of the translation units so that I will be able to read it by index.read()?
Thank you!
You would "simply" need to provide the right args. But be aware of two possible issues.
Different files may require different arguments for parsing. The easiest solution is to obtain compilation database and then extract compile commands from it. If you go this way be aware that you would need to filter out the arguments a bit and remove things like -c FooBar.cpp (potentially some others), otherwise you may get something like ASTReadError.
Another issue is that the include paths (-I ...) may be relative to the source directory. I.e., if a file main.cpp compiled from a directory /opt/project/ with -I include/path argument, then before calling index.parse(source_file, args=args) you need to step in (chdir) into the /opt/project, and when you are done you will probably need to go back to the original working directory. So the code may look like this (pseudocode):
cwd = getcwd()
chdir('/opt/project')
translation_unit = index.parse(source_file, args=args)
chdir(cwd)
I hope it helps.

Vala - Equation parsing

I am trying to learn a bit about Vala and wanted to create a Calculator to test how Gtk worked. The problem is that I coded everything around the supposition that there would be a way to parse a string that contained the required operations. Something like this:
string operation = "5+2/3*4"
I have done this with Python and it is as simple as using the compilers parser. I understand Python is math oriented, but I thought that perhaps there would be Vala library waiting for me as an answer... I haven't found it if it does exist, but as I was looking at the string documentation, I noticed this part:
/* Strings prefixed with '#' are string templates. They can evaluate
* embedded variables and expressions prefixed with '$'.
* Since Vala 0.7.8.
*/
string name = "Dave";
println (#"Good morning, $name!");
println (#"4 + 3 = $(4 + 3)");
So... I thought that maybe there was a way to make it work that way, maybe something like this:
stdout.printf(#"$(operation)")
I understand that this is not an accurate supposition as it will just substitute the variable and require a further step to actually evaluate it.
Right now the two main doubts I am having are: a) Is there a library function capable of doing this? and b) Is it possible to work out a solution using string templates?
Here's something I found that would do the work. I used the C++ libmatheval library, for this I first required a vapi file to bind it to Vala. Which I found here. There are a lot of available vapi files under the project named vala-extra-apis, and they are recognized in GNOME's Vala List of Bindings although they are not included at install.
You could parse the expression using libvala (which is part of the compiler).
The compiler creates a CodeContext and runs the Vala parser over a (or several) .vala file(s).
You could then create your own CodeVisitor decendant class that visits the necessary nodes of the parse tree and evaluates expressions.
As far as I can see there is no expression evaluator that does this, yet. That is because normally vala code is translated to C code and the C compiler then does compile time expression evaluation or the finished executable does the run time evaluation.
Python is different, because it is primarily a scripting language and has evaluation build directly into the runtime / interpreter.

Resources