AST of a project by Clang

I use the Clang Python binding to extract the AST of C/C++ files. It works perfectly for a simple program I wrote. The problem is when I want to use it on a big project like OpenSSL. I can run Clang on any single file of the project, but Clang seems to miss some of the project's headers and gives me the AST of only a few of the file's functions, not all of them. I set the include folder with -I, but I still get only part of the functions.
This is my code:
import clang.cindex as cl
cl.Config.set_library_path(clang_lib_dir)
index = cl.Index.create()
lib = 'Path to include folder'
args = ['-I{}'.format(lib)]
translation_unit = index.parse(source_file, args=args)
my_get_info(translation_unit.cursor)
I get a lot of "header file not found" errors.
UPDATE
I used Make to compile OpenSSL with Clang. I can pass the -emit-ast option to clang to dump the AST of each file, but I cannot read the result with the Clang Python binding.
Any clues how I can save the serialized representation of the translation units so that I can read them back with index.read()?
Thank you!

You would "simply" need to provide the right args. But be aware of two possible issues.
Different files may require different arguments for parsing. The easiest solution is to obtain a compilation database (compile_commands.json) and extract the compile commands from it. If you go this way, be aware that you will need to filter the arguments a bit and remove things like -c FooBar.cpp (and potentially some others); otherwise you may get errors such as ASTReadError.
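With the Python bindings you can read the compilation database directly. A minimal sketch, assuming the build directory contains a compile_commands.json and that stripping -c, -o and the file name is enough filtering for your project:
import clang.cindex as cl

cl.Config.set_library_path(clang_lib_dir)
index = cl.Index.create()

# compile_commands.json lives in the build directory (generated e.g. by
# CMake with CMAKE_EXPORT_COMPILE_COMMANDS, or by the bear tool around make)
db = cl.CompilationDatabase.fromDirectory('/path/to/build/dir')

def args_for(source_file):
    # Take the first compile command recorded for this file and drop the
    # pieces libclang does not want to see.
    cmds = db.getCompileCommands(source_file)
    if not cmds:
        return []
    # cmds[0].directory is the directory the file was compiled from,
    # which matters for the relative-path issue described below.
    raw = list(cmds[0].arguments)[1:]  # drop the compiler executable itself
    args = []
    skip_next = False
    for arg in raw:
        if skip_next:                  # the file name following -o
            skip_next = False
            continue
        if arg == '-o':
            skip_next = True
            continue
        if arg == '-c' or arg == source_file:
            continue
        args.append(arg)
    return args

translation_unit = index.parse(source_file, args=args_for(source_file))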
Another issue is that the include paths (-I ...) may be relative to the source directory. That is, if a file main.cpp is compiled from the directory /opt/project/ with the argument -I include/path, then before calling index.parse(source_file, args=args) you need to step (chdir) into /opt/project, and when you are done you will probably want to go back to the original working directory. So the code may look like this:
import os

cwd = os.getcwd()
os.chdir('/opt/project')  # directory the file was compiled from
translation_unit = index.parse(source_file, args=args)
os.chdir(cwd)             # restore the original working directory
I hope it helps.

Related

How to use --save_temps in Bazel rule instead of command line?

Is there a way to make a Bazel build generate the desired temp files for a list of source files, instead of just using the command-line option --save_temps?
One way is to use a cc_binary and add the -E option to copts, but the output file name will always end in .o. Such .o files get overwritten by other build targets, and I don't know how to control the compiler output file name in Bazel.
Any better ideas?
cc_library has an output group with the static library, which you can then extract. Something like this:
filegroup(
    name = "extract_archive",
    srcs = [":some_cc_library"],
    output_group = "archive",
)
Many tools will accept the static archive instead of an object file. If the tool you're using does, then that's easy. If not, things get a bit more complicated.
Extracting the object file from the static archive is a bit trickier. You could use a genrule with the $(AR) Make variable, but that won't work with some C++ toolchains that require additional flags to configure architectures etc.
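For what it's worth, a genrule along these lines may work with a plain GNU toolchain (a sketch only; the archive member name some_cc_library.o is an assumption, check the real members with $(AR) t first):
genrule(
    name = "extract_object",
    srcs = [":extract_archive"],
    outs = ["some_cc_library.o"],  # assumed member name
    # "ar p" prints the named archive member to stdout
    cmd = "$(AR) p $< some_cc_library.o > $@",
    toolchains = ["@bazel_tools//tools/cpp:current_cc_toolchain"],
)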
The better (but more complicated) answer is to follow the guidance in integrating with C++ rules. You can get the ar from the toolchain and the flags to use it in a custom rule, and then create an action to extract it. You could also access the OutputGroupInfo from the cc_library in the rule directly instead of using filegroup if you've already got a custom rule.
Thanks all for your suggestions.
Now I think I can solve this problem in two steps (it seems Bazel does not allow combining two rules into one):
Step 1: add a -E option to an otherwise normal cc_library; we can call it a pp_library. This part is easy (see the sketch after these steps).
Step 2: write a new rule that takes the pp_library target as input, find the object files inside that rule (they can be found via action.outputs.to_list()), and copy them to a new place via ctx.actions.run_shell().
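A minimal sketch of step 1 (target and file names are placeholders; -E makes the compiler write preprocessed source into the nominal .o outputs):
cc_library(
    name = "foo_pp",   # the pp_library: preprocess only
    srcs = ["foo.cc"],
    copts = ["-E"],    # stop after the preprocessor
)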
I used Bazel: copy multiple files to binary directory as a reference.

Add 'library' directive to dart code generated using protoc

Can someone tell me how to get protoc to generate dart files with a leading library directive?
I'm using the dart-protoc-plugin (v0.10.2) to generate my dart, c++, c#, js and java models from proto files. I was under the impression there was no way to get protoc to add a 'library' directive to the generated dart files, until I noticed the directive appearing in another project (see date.pb.dart).
If I take the same file (date.proto) I cannot get protoc to generate a dart file containing a 'library' directive.
In short: I want to take a .proto file with the following content
syntax = "proto3";
package another.proj.nspace;
message MyObj {
  ...
}
and produce a .dart file with a leading 'library' directive similar to the following snippet
///
// Generated code. Do not modify.
///
// ignore_for_file: non_constant_identifier_names,library_prefixes
library another.proj.nspace;
...
NOTE: I don't care about the actual value of the directive since I can restructure my code to get the desired result. I just need a way for protoc to add the library directive...
The basic command I'm using to generate the dart files is
protoc --proto_path=./ --dart_out="./" ./another/proj/nspace/date.proto
Unfortunately the dart-protoc-plugin's README isn't very helpful, and I had to go through the source to find out which options are available; currently it seems the only Dart-specific option is related to grpc.
I've tried options from the other languages (e.g. 'library', and 'basepath') without any success.
It would simplify my workflow quite a bit if this is possible, but I'm starting to get the impression that the library directive in date.pb.dart is added after the code was generated...
After asking around a little bit, it seems that the library directive was removed from the protoc plugin at some stage (see pull request), thus it is no longer supported.

Add #include's to the headers of a program using llvm clang

I need to add headers to an already existing program by transforming it with LLVM and Clang.
I have used Clang's Rewriter to accomplish similar things, such as changing function names and arguments.
But the header files aren't present in clang's AST. I already know we need to use PPCallbacks (https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html) but I am in dire need of some examples on how to make it work with the rewriter if at all possible.
Alternatively, adding a #include statement just before the first
using namespace <namespace>;
also works. I would like to see an example of this as well.
Any help would be appreciated.
There is a bit of confusion in your question. You need to understand in detail how the preprocessor works. Be aware that most of C++ compilation happens after the preprocessing phase (so most C++ static analyzers work after that phase).
In other words, the C++ specification (and also the C specification) defines first what is preprocessing, and then what is the syntax and the semantics of the preprocessed form.
Concretely, when compiling foo.cc your compiler sees the preprocessed form foo.ii, which you can obtain with clang++ -C -E foo.cc > foo.ii
In the 1980s the preprocessor /lib/cpp was a separate program forked by the compiler (a temporary foo.ii sat on disk and was removed at the end of compilation). Today, for performance reasons, preprocessing is an initial phase done inside the compiler, but you can still reason as if it were a separate step.
Either you want to alter the Clang compiler itself, which (like every other C++ compiler or C++ static analyzer) deals mostly with the preprocessed form. In that case you don't want to add new #include-s; you want to alter the stream of AST nodes given to the compiler (after preprocessing), and that is a different question: you want to insert AST between existing AST elements, independently of any preprocessor directives.
Or you want to automatically change the C++ source code. The hard part is determining what you want to change and where. I suppose you have used some complex machinery to determine that a #include <vector> has to be inserted after line 34 of foo.cc. Once you have that information (and getting it is the hard part), doing the insertion is pretty trivial: for example, read the source line by line and emit your extra line at the right point.
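For illustration, a small Python sketch that inserts a header before the first using namespace line, as you asked (the file name and the header are placeholders):
import re

path = 'foo.cc'                       # placeholder
include_line = '#include <vector>\n'  # placeholder

with open(path) as f:
    lines = f.readlines()

# insert just before the first `using namespace ...;` line
for i, line in enumerate(lines):
    if re.match(r'\s*using\s+namespace\s+', line):
        lines.insert(i, include_line)
        break

with open(path, 'w') as f:
    f.writelines(lines)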

Generating a Vapi file for a Vala library

I've got a library written in Vala that has always generated a .vapi file for itself without problems; I think that's because it's a free operation with valac, but I'm not positive about that. I tried using VAPIGEN_CHECK in my configure.ac file and the associated VAPIGEN_MAKEFILE in my Makefile.am, and now I get:
error: The type name `GLib.TypeInstance' could not be found
My corresponding .gir file contains:
<field name="parent_instance">
  <type name="GObject.TypeInstance" c:type="GTypeInstance"/>
</field>
So the error seems to make sense because I can't find the GObject.TypeInstance class/struct in any .vapi file, but GTypeInstance is in one of the GLib headers.
Should I even be doing it this way if I'm writing everything in Vala already? Is there a possibility that this is missing from the Vapi?
Edit: possibly this is just due to my not deriving from GLib.Object, which I thought was implicit. I'm still trying to fix something else that prevents me from testing this, but once that's done I will update this to say whether or not it actually matters.
To generate a VAPI file from a Vala program you should simply use the --vapi option with valac, e.g.:
valac --vapi my_library_name.vapi my_library.vala
From what you are describing I think you are generating a GIR (GObject Introspection Repository) file with valac, then using vapigen to create the VAPI file. vapigen is part of Vala and maintained in the Vala source code, but it is a tool for generating a VAPI file to bind to non-Vala projects. If the non-Vala project distributes a GIR file it makes the binding very easy.
When using vapigen you need to tell it which packages the library uses, so check that you are including the right --pkg flags, e.g.:
vapigen --pkg glib-2.0 --pkg gobject-2.0 my_library.gir
The other possibility is there is no binding for GTypeInstance in Vala. I've had a quick look and I'm not finding anything.

[perl] How to force Perl to use modules in my own path?

I want Perl to use the DBI module from my own path (say, /home/users/zdd/perl5/lib/DBI), but the system also has a DBI module, in /usr/lib/perl5/lib/DBI.
When I write the following code in my script, Perl uses the system path by default. How can I force it to use the one under my path?
use lib './perl5/lib/DBI';
use DBI;
sub test {
    ....
}
/usr/lib/perl5/lib/DBI was added to the PATH environment variable in my bash profile; it is used by many scripts, so I can't disable it.
The file for the main DBI module is in ./perl5/lib. So your path is not pointing to it.
The DBI folder contains sub-modules of DBI, e.g. DBI::Foo (the :: in module names is a representation of your module directory structure).
Try using ./perl5/lib as your library instead.
Also, using a relative path will fail if the current directory is not what you think it is. If you are in doubt, have your script call cwd to see what the current directory is.
For debugging purposes, it may be helpful to use:
no lib '[main Perl module library path here]';
That way you can be sure you are only using your custom module path. Any failure to find a module will cause an error, rather than silently using the system version.
Update: For more information, see Perldoc on use lib. Perl will use the library that you have specified first. If it does not, that indicates it is not actually finding the module in the location you have given.
In addition to what dan1111 suggested, I would also recommend you print out @INC (just before your use DBI statement) and dump %INC (just after your use DBI statement) to see what your script is doing. That may help you debug the issue.
