How to clone or create an AST Stmt node of clang? - clang

I want to modify the AST with clang LibTooling. How can I clone an AST node or add a new one? E.g. I'd like to create a BinaryOperator with the ADD opcode.

Some of Clang's AST node classes have a static Create method that allocates an instance of that node, with the memory managed by the ASTContext instance passed to it. For these classes you can use Create for instantiation; see the clang::DeclRefExpr class, for instance.
Other classes lack this method but have a public constructor that you may use to instantiate the object. However, the vanilla new and delete operators are purposely hidden, so you cannot use them to instantiate the objects on the heap. Instead, you must use the placement new/delete operators, which take an ASTContext instance as an argument.
Personally, I prefer to allocate all clang-related objects through the ASTContext instance and let it manage the memory internally so I don't have to bother with it (all memory is released when the ASTContext instance is destroyed).
Here is a simple class that allocates memory for a clang object using the placement new operator and an ASTContext instance:
#ifndef CLANG_ALLOCATOR_H
#define CLANG_ALLOCATOR_H

#include <utility>

#include <clang/AST/ASTContext.h>

/// Allocator that relies on clang's AST context for actual memory
/// allocation. Any class that wishes to allocate an AST node may
/// create an instance of this class for that purpose.
class ClangAllocator
{
public:
    explicit ClangAllocator(clang::ASTContext& ast_context)
        : m_ast_context(ast_context)
    {
    }

    template<class ClassType, class... Args>
    ClassType* Alloc(Args&&... args)
    {
        return new (m_ast_context) ClassType(std::forward<Args>(args)...);
    }

private:
    clang::ASTContext& m_ast_context;
};

#endif // CLANG_ALLOCATOR_H
Regarding the AST modifications, probably the best way to accomplish this is to inherit from the TreeTransform class and override its Rebuild methods, which are invoked to produce new statements for various AST nodes.
If all you need is to replace one AST node with another, a very simple way to achieve this is to find its immediate parent statement and then use std::replace on its children. For example:
// immediate_parent is the immediate parent of old_stmt
std::replace(immediate_parent->child_begin(),
             immediate_parent->child_end(),
             old_stmt,
             new_stmt);

Creating new AST nodes is quite cumbersome in Clang, and it's not the recommended way to use libTooling. Rather, you should "read" the AST and emit code, or code changes (rewritings, replacements, etc.), back.
See this article and other articles (and code samples) linked from it for more information on the right way to do this.

Related

Representing a class table in Rascal

I would like to represent a kind of class table (CT) as a singleton in Rascal, so that some transformations might refer to the same CT. Since not all transformations need to refer to the CT (and I prefer not to change the signature of the existing transformations), I was wondering if it is possible to implement a kind of singleton object in Rascal.
Is there any recommendation for representing this kind of situation?
Edit: I found a solution, though I'm still not sure whether this is the idiomatic Rascal approach.
module lang::java::analysis::ClassTable

import Map;
import lang::java::m3::M3Util;

// the class table considered in the source
// code analysis and transformations.
map[str, str] classTable = ();

/**
 * Load a class table from a list of JAR files.
 * It uses a simple cache mechanism to avoid loading the
 * class table each time it is necessary.
 */
map[str, str] loadClassTable(list[loc] jars) {
  if (size(classTable) == 0) {
    classTable = classesHierarchy(jars);
  }
  return classTable;
}
Two answers to the question "what to do if you want to share data across functions and modules, but not pass the data around as an additional parameter or an additional return value?":
a public global variable can hold a reference to a shared data object, like so: public int myGlobalInt = 666; This works for all kinds of (complex) data, including class tables. Use this only if you need shared state in the public variable.
a @memo function is a way to provide fast access to shared data in case you need to share data that will not be modified (i.e. you do not need shared state): @memo int mySharedDataProvider(MyType myArgs) = hardToGetData();. The function's behavior must be free of side-effects, i.e. "functional"; it will then never recompute the return value for previously supplied arguments (instead it uses an internal table to cache previous results).

Vala object class lighter than normal class

"What happens if you don't inherit from Object? Nothing terrible. These classes will be slightly more lightweight, however, they will lack some features such as property change notifications, and your objects won't have a common base class. Usually inheriting from Object is what you want." Vala team said.
So I wanted to know how light the classes are with or without inheriting from Object.
Here are my test files:
test1.vala:
class Aaaa : Object {
public Aaaa () { print ("hello\n"); }
}
void main () { new Aaaa (); }
test2.vala:
class Aaaa {
public Aaaa () { print ("hello\n"); }
}
void main () { new Aaaa (); }
The results after compilation were totally unexpected: the size of test1 is 9.3 KB and the size of test2 is 14.9 KB, which contradicts what they said. Can someone explain this, please?
You are comparing the produced object code / executable size, but that's not what the statement from the tutorial was referring to.
It is referring to the features that your class will support. It's just clarifying that you don't get all the functionality that GLib.Object / GObject provides.
In C# (and in Java, too?) the type system is "rooted", which means all classes implicitly derive from System.Object. That is not the case for Vala: Vala classes can be "stand-alone" classes, which means that these stand-alone classes don't have any parent class (not even GLib.Object / GObject).
The code size is bigger because a stand-alone class doesn't reuse any functionality from GLib.Object / GObject (which is implemented in GLib), so the compiler has to output more boilerplate code (writing classes in C always involves a lot of boilerplate code).
You can check for yourself with "valac -C yourfile.vala", which will produce a "yourfile.c" file.
That's a very interesting question. The answer will take you deep into how GObjects work. With these kinds of questions, a useful feature of valac is the --ccode switch, which produces the C code instead of the binary. If you look at the C code of the second code sample, which doesn't inherit from Object, it includes a lot more functions, such as aaaa_ref and aaaa_unref. These are basic functions used to handle objects in GLib's object system. When you inherit from Object these functions are already defined in the parent class, so the C code and resulting binary are smaller.
By just using class without inheriting from Object you are creating your own GType but not inheriting all the features of Object, so in that sense your classes are lighter weight. This makes them quicker to instantiate: if you time how long it takes to create a huge number of plain GType objects compared to the same number of GObject-inheriting objects, you should see the GType objects being created more quickly. As you have pointed out, GType objects lose some additional features, so the choice depends on your application.

iOS circular dependencies between block definitions

In my iOS application I want to define two block types that take each other as parameters:
typedef void (^BlockA)(BlockB b);
typedef void (^BlockB)(BlockA a);
This fails to compile with 'Unknown type name BlockB' in the first typedef (which makes sense).
I have a workaround which defines the types like this:
typedef void (^BlockA)(id);
typedef void (^BlockB)(BlockA a);
I then cast back to the BlockB type inside the BlockA definition, but at the expense of type safety.
I also looked at not using typedefs, but this results in an infinite nesting of expanded block definitions.
I know how to resolve circular dependencies for structs with forward declarations, but I can't see how to do this with blocks.
If there is no solution to the circular dependency, is there a way to restrict the parameter of BlockA to be any block type rather than the generic id? That would give some level of type safety.
typedef does not define a "real" type. It's basically like a macro that expands out everywhere it's used. That's why typedefs cannot be recursive.
Another way to think about it is, typedefs are never necessary -- you can always take any piece of code with a typedef, and simply replace every occurrence of it with the underlying type (that's what the compiler does when you compile), and it will always work and be completely equivalent. Think about it -- how would you do it without a typedef? You can't. So you can't do it with a typedef either.
The only ways to do it are: use id as the argument type to erase the type, like you're doing; or encapsulate the block inside a "real" type like a struct or class. However, if you do it the latter way, you have to explicitly put the block into and extract the block out of the struct or class, which makes the code confusing. Also, a struct is dangerous here: a struct is a plain C value type, so if a block needs to capture it, the objects inside the struct are not automatically memory-managed. As for a class, defining a wrapping class is very verbose, and using it for this causes the allocation of an extraneous dummy object for every block it wraps.
In my opinion, using id like you're using is fine and is the cleanest way. However, keep in mind that if the block passed as id needs to be captured by another, inner block, you should cast it back to the block type before it is captured, as the capturing semantics are different for blocks and other object types (blocks are copied, whereas other objects are retained). Just casting it back to the block type at the earliest place will work.
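A minimal sketch of that cast-back workaround, assuming the BlockA/BlockB typedefs from the question (the variable names here are illustrative):

```objectivec
#import <Foundation/Foundation.h>

typedef void (^BlockA)(id);
typedef void (^BlockB)(BlockA a);

int main(void) {
    @autoreleasepool {
        BlockB b = ^(BlockA inner) {
            if (inner) inner(nil);
        };
        BlockA a = ^(id rawB) {
            // Cast back to the real block type as early as possible,
            // before any inner block captures the parameter as id.
            BlockB typedB = (BlockB)rawB;
            typedB(nil); // passing nil here just stops the mutual recursion
        };
        a(b);
    }
    return 0;
}
```

The cast is the only unsafe point; everything downstream of it is fully typed again.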

Java EE 6 : #Inject and Instance<T>

I have a question about the @Inject annotation in Java EE 6.
What is the difference between:
@Inject
private TestBean test;
and:
@Inject
private Instance<TestBean> test2;
where the reference is obtained with:
test2.get();
Some info about Instance: http://docs.oracle.com/javaee/6/api/javax/enterprise/inject/Instance.html
Maybe it doesn't create the object until it's requested by get()? I just want to know which one is better for JVM memory. I think a direct @Inject will create an instance of the object right away, even if it's not used by the application...
Thank you!
The second is what's called deferred injection or initialization. Your container will defer the work of locating, initializing, and injecting the proper object for TestBean until you call get(), in most circumstances.
As far as "which one is better", you should defer to the rules of optimization. Don't optimize until you have a problem, and use a profiler.
In other words, use the first one unless you can definitively prove the second one saves you significant amounts of memory and CPU.
Let me know if that answers your question!
Further information on use cases for Instance can be found in the documentation:
In certain situations, injection is not the most convenient way to obtain a contextual reference. For example, it may not be used when:
the bean type or qualifiers vary dynamically at runtime
there may be no bean which satisfies the type and qualifiers
we would like to iterate over all beans of a certain type
This is pretty cool, so you can do something like:
@Inject @MyQualifier Instance<MyType> allMyCandidates;
You can then obtain an Iterator from allMyCandidates and iterate over all the qualified objects.
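The eager-versus-deferred difference can be sketched in plain Java using Supplier as a rough stand-in for Instance (no CDI container here; the class and variable names are illustrative):

```java
import java.util.function.Supplier;

class TestBean {
    static int constructed = 0;
    TestBean() { constructed++; }
}

public class DeferredDemo {
    public static void main(String[] args) {
        // Eager: analogue of "@Inject TestBean test" -- built up front.
        TestBean test = new TestBean();

        // Deferred: analogue of "@Inject Instance<TestBean> test2" --
        // nothing is constructed until get() is called.
        Supplier<TestBean> test2 = TestBean::new;
        System.out.println("after wiring: " + TestBean.constructed);

        test2.get();
        System.out.println("after get(): " + TestBean.constructed);
    }
}
```

The counter only advances for the deferred bean when get() runs, which is the behavior the container provides for Instance<T>.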

F# mutual recursion between modules

For recursion in F#, existing documentation is clear about how to do it in the special case where it's just one function calling itself, or a group of physically adjacent functions calling each other.
But in the general case where a group of functions in different modules need to call each other, how do you do it?
I don't think there is a way to achieve this in F#. It is usually possible to structure the application in a way that doesn't require this, so perhaps if you described your scenario, you may get some useful comments.
Anyway, there are various ways to workaround the issue - you can declare a record or an interface to hold the functions that you need to export from the module. Interfaces allow you to export polymorphic functions too, so they are probably a better choice:
// Before the declaration of modules
type Module1Funcs = 
  abstract Foo : int -> int

type Module2Funcs = 
  abstract Bar : int -> int
The modules can then export a value that implements one of the interfaces and functions that require the other module can take it as an argument (or you can store it in a mutable value).
module Module1 = 
  // Import functions from Module2 (needs to be initialized before use!)
  let mutable module2 = Unchecked.defaultof<Module2Funcs>

  // Sample function that references Module2
  let foo a = module2.Bar(a)

  // Export functions of the module
  let impl = 
    { new Module1Funcs with
        member x.Foo(a) = foo a }

// Somewhere in the main function
Module1.module2 <- Module2.impl
Module2.module1 <- Module1.impl
The initialization could also be done automatically using reflection, but that's a bit ugly. However, if you really needed it frequently, I could imagine developing some reusable library for this.
In many cases, this feels a bit ugly and restructuring the application to avoid recursive references is a better approach (in fact, I find recursive references between classes in object-oriented programming often quite confusing). However, if you really need something like this, then exporting functions using interfaces/records is probably the only option.
This is not supported. One piece of evidence is that, in Visual Studio, you need to order the project files correctly for F#.
It would be really rare to need two functions in two different modules to call each other recursively.
If this case does happen, you'd better factor the common part of the two functions out.
I don't think that there's any way for functions in different modules to directly refer to functions in other modules. Is there a reason that functions whose behavior is so tightly intertwined need to be in separate modules?
If you need to keep them separated, one possible workaround is to make your functions higher order so that they take a parameter representing the recursive call, so that you can manually "tie the knot" later.
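The "tie the knot" idea can be sketched like this (the module and function names are illustrative, and the bodies are kept trivial just to show the wiring):

```fsharp
module Module1 =
    // Takes the recursive call as an extra parameter instead of
    // referencing Module2 directly.
    let foo bar a = if a <= 0 then 0 else bar (a - 1)

module Module2 =
    let bar foo a = if a <= 0 then 1 else foo (a - 1)

// Somewhere after both modules are in scope: tie the knot
let rec foo a = Module1.foo bar a
and bar a = Module2.bar foo a
```

Neither module refers to the other; only the final `let rec ... and ...` binds the two halves together.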
If you were talking about C#, and methods in two different assemblies needed to mutually recursively call each other, I'd pull out the type signatures they both needed to know into a third, shared, assembly. I don't know however how well those concepts map to F#.
A definite solution here would use module signatures. A signature file contains information about the public signatures of a set of F# program elements, such as types, namespaces, and modules.
For each F# code file, you can have a signature file: a file with the same name as the code file but with the extension .fsi instead of .fs.
