AST matcher on a specific node - clang

I wrote an AST matcher that finds statements of a specific type. For each matched node I compute its neighboring siblings. Now I need to run a matcher on those neighbor nodes to verify whether they satisfy my condition.
The Clang AST matcher visits the nodes of the whole tree one by one. I want to run a matcher against one particular node and get true if that node matches my condition.
Is this possible?

I suggest implementing your own matcher that encapsulates the logic of finding neighbor nodes and matching them against other matchers.
I've put together the following matcher as an example of how that can be done:
#include <algorithm>
#include <iterator>

#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "llvm/ADT/SmallVector.h"

using clang::ast_matchers::internal::Matcher;

constexpr auto AVERAGE_NUMBER_OF_NESTED_MATCHERS = 3;
using Matchers =
    llvm::SmallVector<Matcher<clang::Stmt>, AVERAGE_NUMBER_OF_NESTED_MATCHERS>;

clang::Stmt *getNeighbor(const clang::Stmt &Node, clang::ASTContext &Context) {
  // It is my naive implementation of this method, you can easily put your own
  auto Parents = Context.getParents(Node);
  if (Parents.size() != 1) {
    return nullptr;
  }

  // As we deal with statements, let's assume that neighbor - is the next
  // statement in the enclosing compound statement.
  if (auto *Parent = Parents[0].get<clang::CompoundStmt>()) {
    // adjacent_find walks consecutive (Top, Bottom) pairs; a hit on Top means
    // the node has a successor, so the last statement yields body_end().
    auto Neighbor = std::adjacent_find(
        Parent->body_begin(), Parent->body_end(),
        [&Node](const auto *Top, const auto *Bottom) { return Top == &Node; });
    if (Neighbor != Parent->body_end()) {
      return *std::next(Neighbor);
    }
  }

  return nullptr;
}

AST_MATCHER_P(clang::Stmt, neighbors, Matchers, NestedMatchers) {
  // Node is the current tested node
  const clang::Stmt *CurrentNode = &Node;

  // Our goal is to iterate over the given matchers and match the current node
  // with the first matcher.
  //
  // Further on, we plan on checking whether the next
  // matcher matches the neighbor/sibling of the previous node.
  for (const auto &NestedMatcher : NestedMatchers) {
    // This is how one can call a matcher to test one node.
    //
    // NOTE: it uses Finder and Builder, so it's better to do it from
    //       inside of a matcher and get those for free
    if (CurrentNode == nullptr or
        not NestedMatcher.matches(*CurrentNode, Finder, Builder)) {
      return false;
    }

    // Here you can put your own implementation of finding neighbor/sibling
    CurrentNode = getNeighbor(*CurrentNode, Finder->getASTContext());
  }

  return true;
}
I hope that the comments within the snippet cover the main ideas behind this matcher.
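The sibling lookup in getNeighbor relies on std::adjacent_find with a predicate that looks only at the first element of each pair. A minimal standalone sketch of that idiom, with no Clang dependency, may make its edge cases easier to see. The nextSibling helper and the vector of int pointers are hypothetical stand-ins for the CompoundStmt body and are not part of the matcher itself:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for getNeighbor: Items plays the role of the
// CompoundStmt body and Target the role of the current node.
// Returns the element directly after Target, or nullptr when Target is the
// last element or is not contained in Items at all.
const int *nextSibling(const std::vector<const int *> &Items,
                       const int *Target) {
  // adjacent_find visits consecutive pairs (Top, Bottom). The predicate only
  // inspects Top, so a hit means "Target has a successor".
  auto It = std::adjacent_find(
      Items.begin(), Items.end(),
      [Target](const int *Top, const int * /*Bottom*/) {
        return Top == Target;
      });
  return It != Items.end() ? *std::next(It) : nullptr;
}
```

With a body of three elements, the first element yields the second, while the last element and any pointer that is not in the body both yield nullptr. That is the same reason the matcher stops with false when it runs out of siblings before it runs out of nested matchers.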
Demo
Matcher:
neighbors({declStmt().bind("first"), forStmt().bind("second"),
           returnStmt().bind("third")})
Code snippet:
int foo() {
  int x = 42;
  int y = 10;
  for (; x > y; --x) {
  }
  return x;
}
Output:
first:
DeclStmt 0x4c683e0
`-VarDecl 0x4c68360 used y 'int' cinit
  `-IntegerLiteral 0x4c683c0 'int' 10
second:
ForStmt 0x4c684d0
|-<<<NULL>>>
|-<<<NULL>>>
|-BinaryOperator 0x4c68468 '_Bool' '>'
| |-ImplicitCastExpr 0x4c68438 'int' <LValueToRValue>
| | `-DeclRefExpr 0x4c683f8 'int' lvalue Var 0x4c682b0 'x' 'int'
| `-ImplicitCastExpr 0x4c68450 'int' <LValueToRValue>
|   `-DeclRefExpr 0x4c68418 'int' lvalue Var 0x4c68360 'y' 'int'
|-UnaryOperator 0x4c684a8 'int' lvalue prefix '--'
| `-DeclRefExpr 0x4c68488 'int' lvalue Var 0x4c682b0 'x' 'int'
`-CompoundStmt 0x4c684c0
third:
ReturnStmt 0x4c68540
`-ImplicitCastExpr 0x4c68528 'int' <LValueToRValue>
  `-DeclRefExpr 0x4c68508 'int' lvalue Var 0x4c682b0 'x' 'int'
I hope that answers your question!

Related

In constant expressions, operands of this operator must be of type 'bool' or 'int' on Dart using switch/case with enum

I have some simple code like this:
enum myEnum {
  foo,
  bar,
  baz
}
mixin myMixin {
  final Map<myEnum, double> values = {
    myEnum.foo: 0.3,
    myEnum.bar: 5.2,
    myEnum.baz: 91.3,
  };
  double getValue(myEnum key, { double a = 0.0, double b = 0.0 }) {
    final double tot = a + b;
    double c = 0.0;
    switch(key) {
      case myEnum.foo | myEnum.bar:
        -1;
        break;
    }
    return values[key]!;
  }
}
and it is generating this error:
In constant expressions, operands of this operator must be of type 'bool' or 'int'.
on the line case myEnum.foo | myEnum.bar: but I cannot work out why.
Any idea?
That is not valid Dart.
The way to match two different values in a switch is to have two case clauses:
case myEnum.foo:
case myEnum.bar:
  ...
The | operator is not defined on enums by default (with enhanced enums you can declare one, since it is a user-definable operator). The expression after case must be a constant expression, and in constant expressions | is only allowed on integers (bitwise or) and booleans (boolean or). So the expression myEnum.foo | myEnum.bar is invalid in more than one way:
- It's not constant.
- myEnum doesn't have a | operator.
You only got one of the errors.

Matching std::optional<bool> with Clang AST

I'm trying to write a clang-tidy check for std::optional<bool> r = 5 to catch implicit conversions to bool.
|-DeclStmt <line:4:5, col:30>
| `-VarDecl <col:5, col:29> col:25 r 'std::optional<bool>':'std::optional<bool>' cinit
|   `-ExprWithCleanups <col:29> 'std::optional<bool>':'std::optional<bool>'
|     `-ImplicitCastExpr <col:29> 'std::optional<bool>':'std::optional<bool>' <ConstructorConversion>
|       `-CXXConstructExpr <col:29> 'std::optional<bool>':'std::optional<bool>' 'void (int &&) noexcept(is_nothrow_constructible_v<bool, int>)'
|         `-MaterializeTemporaryExpr <col:29> 'int' xvalue
|           `-IntegerLiteral <col:29> 'int' 5
So far, I have match implicitCastExpr(hasDescendant(cxxConstructExpr())), which matches an implicitCastExpr that contains a cxxConstructExpr. The problem is that I want to narrow the match on cxxConstructExpr to only the cases where bool is the template argument. Does anyone know how to do this?
Inside cxxConstructExpr(...), you also need to use hasType, classTemplateSpecializationDecl, hasTemplateArgument, refersToType, and booleanType.
Here is a shell script invoking clang-query that finds implicit conversions to std::optional<bool> from a type other than bool:
#!/bin/sh

query='m
  implicitCastExpr(                          # Implicit conversion
    hasSourceExpression(                     # from a
      cxxConstructExpr(                      # constructor expr
        hasType(                             # whose type is
          classTemplateSpecializationDecl(   # a template specialization
            hasName("::std::optional"),      # of std::optional
            hasTemplateArgument(             # with template argument
              0, refersToType(booleanType()) # bool,
            )
          )
        ),
        unless(                              # unless
          hasArgument(                       # the constructor argument
            0, expr(                         # is an expr with
              hasType(                       # type
                booleanType()                # bool.
              )
            )
          )
        )
      )
    )
  )'

# Forward all script arguments (source file, "--", compiler flags) unchanged.
clang-query -c="$query" "$@"
(I use a shell script so I can format the query expression and add comments.)
Test input test.cc:
// test.cc
// Test clang-query finding implicit conversions to optional<bool>.

#include <optional>                    // std::optional

void f()
{
  std::optional<bool> r = 5;           // reported
  std::optional<int> s = 6;            // not reported
  std::optional<bool> t = false;       // not reported
}

// EOF
Invocation of the script (saved as cmd.sh):
$ ./cmd.sh test.cc -- --std=c++17
Match #1:
[...path...]/test.cc:8:27: note: "root" binds here
  std::optional<bool> r = 5;           // reported
                          ^
1 match.
I used Clang+LLVM-14.0.0, although I don't think I've used anything particularly recent here.
Figuring out these match expressions can be quite difficult. The main reference is the AST Matcher Reference, but even with that, it often requires a lot of trial and error.
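As background for why such a check is worth having: the converting constructor of std::optional<bool> accepts any argument that is implicitly convertible to bool, so the reported line compiles without a diagnostic and stores the converted value. A minimal sketch, assuming C++17; the Probe struct and the probe function are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper type: records what ends up inside the optional.
struct Probe {
  bool engaged;  // does the optional hold a value?
  bool value;    // the bool that was stored
};

// Performs the same kind of initialization that the query reports:
// an int is silently converted to bool inside the optional.
Probe probe(int v) {
  std::optional<bool> r = v;  // int -> bool via the converting constructor
  return {r.has_value(), *r};
}
```

Under these assumptions probe(5) yields an engaged optional holding true, and probe(0) yields an engaged optional holding false. The second case is the subtle one, because an optional that is engaged but holds false still tests as true in an if (r) condition.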

Type inference is dynamic on operator overloading with extension in Dart

I'm trying to implement a pipe operator by overloading |:
extension Pipe on Object {
  operator |(Function(Object) f) => f(this);
}
typedef Id = A Function<A>(A);
Id id = <A>(A a) => a;
var t1 = id("test"); // String t1
var t2 = "test" | id; // dynamic t2
With the generic Id function above, id("test") is inferred as String, but "test" | id is inferred as dynamic, which is very problematic.
How can I fix this?
EDIT
Thankfully, @jamesdlin has answered and suggested:
extension Pipe on Object {
  Object operator |(Object Function(Object) f) => f(this);
}
The result has improved to:
var t2 = "test" | id; // Object t2
I also tried a generic version:
extension Pipe<A, B> on A {
  B operator |(B Function(A) f) => f(this);
}
I expected this to work better, because the generic parameters A and B are more specific than Object; however, the result is as bad as before:
var t2 = "test" | id; // dynamic t2
Why doesn't the generic version work? Is there any way to make the Dart compiler infer the type as String?
Your operator | extension does not have a declared return type. Its return type therefore is implicitly dynamic. Also note its callback argument does not specify a return type either, so that also will be assumed to be dynamic.
Declare return types:
extension Pipe on Object {
  Object operator |(Object Function(Object) f) => f(this);
}
(Answering your original question about a NoSuchMethodError: when you did
var x = "test" | id;
x | print;
x has type dynamic, but extension methods are static; they are compile-time syntactic sugar and will never work on dynamic types. Consequently, x | print attempts to call operator | on the object that x refers to, but that object doesn't actually have an operator |.)

Breaking head over how to get position of token with a rule - ANTLR4 / grammar

I'm writing a little grammar using ANTLR, and I have a rule like this:
operation : OPERATION (IDENT | EXPR) ',' (IDENT | EXPR);
...
OPERATION : 'ADD' | 'SUB' | 'MUL' | 'DIV' ;
IDENT : [a-z]+;
EXPR : INTEGER | FLOAT;
INTEGER : [0-9]+ | '-'[0-9]+;
FLOAT : [0-9]+'.'[0-9]+ | '-'[0-9]+'.'[0-9]+;
Now, in the Java listener, when an operation consists of both an IDENT and an EXPR, how do I determine the order in which they appear?
Obviously the rule can match both
ADD 10, d
or
ADD d, 10
But in the listener that ANTLR4 generates for the rule, if both IDENT() and EXPR() are present, how do I get their order? I want to assign the left and right operands correctly.
I've been racking my brain over this. Is there a simple way, or should I rewrite the rule itself? ctx.getTokens() requires me to pass the token type, which defeats the purpose: if I have to specify the type, I cannot get the sequence of the tokens in the rule.
You can do it like this:
operation : OPERATION lhs=(IDENT | EXPR) ',' rhs=(IDENT | EXPR);
and then inside your listener, do this:
@Override
public void enterOperation(TParser.OperationContext ctx) {
  if (ctx.lhs.getType() == TParser.IDENT) {
    // left hand side is an identifier
  } else {
    // left hand side is an expression
  }
  // check `rhs` the same way
}
where TParser comes from the grammar file T.g4. Change this accordingly.
Another solution would be something like this:
operation
: OPERATION ident_or_expr ',' ident_or_expr
;
ident_or_expr
: IDENT
| EXPR
;
and then in your listener:
@Override
public void enterOperation(TParser.OperationContext ctx) {
  Double lhs = findValueFor(ctx.ident_or_expr().get(0));
  Double rhs = findValueFor(ctx.ident_or_expr().get(1));
  ...
}
private Double findValueFor(TParser.Ident_or_exprContext ctx) {
  if (ctx.IDENT() != null) {
    // it's an identifier
  } else {
    // it's an expression
  }
}

How can I convert concrete syntax values to other kinds of values?

Given some concrete syntax value, how can I map it to a different type of value (in this case an int)?
// Syntax
start syntax MyTree = \node: "(" MyTree left "," MyTree right ")"
                    | leaf: Leaf leaf
                    ;
layout MyLayout = [\ \t\n\r]*;
lexical Leaf = [0-9]+;
This does not work unfortunately:
public Tree increment() {
  MyTree tree = (MyTree)`(3, (1, 10))`;
  return visit(tree) {
    case l:(Leaf)`3` => l + 1
  };
}
Or is the only way to implode into an ADT where I specified the types?
Your question has different possible answers:
- Using implode you can convert a parse tree to an abstract tree. If the constructors of the target abstract language expect int, then lexical trees which happen to match [0-9]+ will be converted automatically. For example, the syntax tree for syntax Exp = intValue: IntValue; could be converted to the constructor data Exp = intValue(int i); and it will actually build an int for the field i.
- In general, to convert one type of value to another in Rascal, you write (mutually) recursive functions, as in int eval(MyTree t) and int eval(Leaf l).
- If you want to actually increment the syntactic representation of a Leaf value, you have to convert the resulting int back to a Leaf (by parsing or via a concrete pattern).
Example:
import String;
MyTree increment() {
  MyTree tree = (MyTree)`(3, (1, 10))`;
  return visit(tree) {
    case Leaf l => [Leaf] "<toInt("<l>") + 1>";
  };
}
First the lexical is converted to a string with "<l>". That string is parsed as an int using toInt(), and + 1 adds 1 to it. The resulting int is then mapped back to a string with "< ... >", after which [Leaf] calls the Leaf parser on that string.
