Matching std::optional<bool> with Clang AST - clang

I'm trying to write a clang-tidy check for std::optional<bool> r = 5 to catch implicit conversions to bool.
|-DeclStmt <line:4:5, col:30>
| `-VarDecl <col:5, col:29> col:25 r 'std::optional<bool>':'std::optional<bool>' cinit
| `-ExprWithCleanups <col:29> 'std::optional<bool>':'std::optional<bool>'
| `-ImplicitCastExpr <col:29> 'std::optional<bool>':'std::optional<bool>' <ConstructorConversion>
| `-CXXConstructExpr <col:29> 'std::optional<bool>':'std::optional<bool>' 'void (int &&) noexcept(is_nothrow_constructible_v<bool, int>)'
| `-MaterializeTemporaryExpr <col:29> 'int' xvalue
| `-IntegerLiteral <col:29> 'int' 5
So far, I have match implicitCastExpr(hasDescendant(cxxConstructExpr())), which matches an implicitCastExpr that contains a cxxConstructExpr. The problem is that I want to narrow the cxxConstructExpr match to cases where bool is the template argument. Does anyone know how to do this?

Inside cxxConstructExpr(...), you also need to use hasType, classTemplateSpecializationDecl, hasTemplateArgument, refersToType, and booleanType.
Here is a shell script invoking clang-query that finds implicit conversions to std::optional<bool> from a type other than bool:
#!/bin/sh
query='m
implicitCastExpr( # Implicit conversion
hasSourceExpression( # from a
cxxConstructExpr( # constructor expr
hasType( # whose type is
classTemplateSpecializationDecl( # a template specialization
hasName("::std::optional"), # of std::optional
hasTemplateArgument( # with template argument
0, refersToType(booleanType()) # bool,
)
)
),
unless( # unless
hasArgument( # the constructor argument
0, expr( # is an expr with
hasType( # type
booleanType() # bool.
)
)
)
)
)
)
)'
clang-query -c="$query" "$@"
(I use a shell script so I can format the query expression and add comments.)
Test input test.cc:
// test.cc
// Test clang-query finding implicit conversions to optional<bool>.
#include <optional> // std::optional
void f()
{
std::optional<bool> r = 5; // reported
std::optional<int> s = 6; // not reported
std::optional<bool> t = false; // not reported
}
// EOF
Invocation of the script (saved as cmd.sh):
$ ./cmd.sh test.cc -- --std=c++17
Match #1:
[...path...]/test.cc:8:27: note: "root" binds here
std::optional<bool> r = 5; // reported
^
1 match.
I used Clang+LLVM-14.0.0, although I don't think I've used anything particularly recent here.
Figuring out these match expressions can be quite difficult. The main reference is the AST Matcher Reference, but even with that, it often requires a lot of trial and error.
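To illustrate why the flagged conversion is worth catching, here is a minimal standalone sketch (not part of the check itself; the function names are made up for illustration). It shows that initializing std::optional<bool> from an int compiles silently and stores the int converted to bool, so even 0 yields an engaged optional.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: copy-initialize optional<bool> from a non-zero int.
// The converting constructor stores bool(5), which is true.
bool nonZeroIntStoresTrue() {
  std::optional<bool> r = 5;
  return r.has_value() && *r;
}

// Hypothetical helper: copy-initialize optional<bool> from 0.
// The optional is still engaged; it simply holds false.
bool zeroIsEngagedButFalse() {
  std::optional<bool> z = 0;
  return z.has_value() && !*z;
}
```

Both functions return true when compiled as C++17, which is the kind of silent conversion the query above is meant to report.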

Related

How can I match universal reference arguments to a member function using Clang AST matchers?

I'm trying to match arguments passed to a templated member function invocation using clang-query as a precursor to writing a clang-tidy check. Whilst I can get non-templated member functions to match, I'm unable to get templated member functions to. I'm using clang-14.0.0.
Consider:
#include <string>
class BaseTrace {
public:
template <typename... Args>
void Templated(const char *fmt, Args &&...args) {
}
void NotTemplated(const char *fmt, const char *s1, const char *s2) {
}
};
BaseTrace TRACE;
void trace1(const std::string &s1, const std::string &s2)
{
TRACE.Templated("One:{} Two:{}\n", s1.c_str(), s2.c_str());
}
void trace2(const std::string &s1, const std::string &s2)
{
TRACE.NotTemplated("One:{} Two:{}\n", s1.c_str(), s2.c_str());
}
Querying for normal member invocation function matches as expected:
clang-query> match cxxMemberCallExpr(callee(functionDecl(hasName("NotTemplated"))), on(expr(hasType(cxxRecordDecl(hasName("::BaseTrace"))))), hasAnyArgument(cxxMemberCallExpr()))
Match #1:
/home/mac/git/llvm-project/build/../clang-tools-extra/test/clang-tidy/checkers/minimal-cstr.cpp:23:3: note: "root" binds here
TRACE.NotTemplated("One:{} Two:{}\n", s1.c_str(), s2.c_str());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 match.
But querying for a templated member function invocation does not:
clang-query> match cxxMemberCallExpr(callee(functionDecl(hasName("Templated"), isTemplateInstantiation())), on(expr(hasType(cxxRecordDecl(hasName("::BaseTrace"))))), hasAnyArgument(cxxMemberCallExpr()))
0 matches.
The answer to this question implies that this ought to work. If I don't try to match the arguments then I am able to get a match:
clang-query> match cxxMemberCallExpr(callee(functionDecl(hasName("Templated"), isTemplateInstantiation())), on(anyOf(expr(hasType(cxxRecordDecl(hasName("::BaseTrace")))), expr(hasType(cxxRecordDecl(isDerivedFrom("::BaseTrace")))))))
Match #1:
/home/mac/git/llvm-project/build/../clang-tools-extra/test/clang-tidy/checkers/minimal-cstr.cpp:16:3: note: "root" binds here
TRACE.Templated("One:{} Two:{}\n", s1.c_str(), s2.c_str());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 match.
I don't seem to have the same problem when matching non-member functions.
Am I missing something, or is this potentially a clang limitation?
I'm not convinced that I've got to the bottom of this, but it appears that I can make the Templated method match by either using:
clang-query> set traversal IgnoreUnlessSpelledInSource
rather than the AsIs default with the match expression in the question, or by modifying the match expression to be:
clang-query> match cxxMemberCallExpr(callee(functionDecl(hasName("Templated"), isTemplateInstantiation())), on(expr(hasType(cxxRecordDecl(hasName("::BaseTrace"))))), hasAnyArgument(materializeTemporaryExpr(has(cxxMemberCallExpr()))))
to match the extra MaterializeTemporaryExpr node in the syntax tree.
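A plausible reason for the extra node, sketched below under the assumption that it comes from reference binding: Templated takes Args &&..., so each s1.c_str() result (a prvalue) has to be bound to a reference parameter, whereas NotTemplated takes its pointers by value. The helper names here are hypothetical and only inspect the deduced parameter types.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for BaseTrace::Templated. Returns true when every
// forwarded parameter has rvalue-reference type, meaning each argument was
// a temporary bound to a reference.
template <typename... Args>
constexpr bool allBoundAsRvalueRefs(const char * /*fmt*/, Args &&...) {
  return (std::is_rvalue_reference_v<Args &&> && ...);
}

// c_str() yields a prvalue, so Args deduces to const char * and the
// parameter type is const char *&&.
bool temporariesBindAsRvalues() {
  std::string s1 = "one";
  return allBoundAsRvalueRefs("{}", s1.c_str());
}

// A named pointer is an lvalue, so Args deduces to const char *& and the
// parameter is an lvalue reference instead.
bool namedPointerBindsAsLvalue() {
  std::string s1 = "one";
  const char *p = s1.c_str();
  return !allBoundAsRvalueRefs("{}", p);
}
```

This needs C++17 for the fold expression. It does not inspect the AST; it only shows that the templated call binds temporaries to references, which is where a MaterializeTemporaryExpr would be expected.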

value is a function while a set was expected while evaluating 'outputs'

I'm getting the above error when attempting to check a flake; I'm trying to use flake-compat on a non-NixOS system for compatibility with home-manager.
This is the flake that's causing the trace below:
error: value is a function while a set was expected
at /nix/store/l22dazwy8cgxdvndhq45br310nap92x3-source/etc/nixos/flake.nix:167:136:
166|
167| outputs = inputs@{ self, nix, nixpkgs, flake-utils, flake-compat, ... }: with builtins; with nixpkgs.lib; with flake-utils.lib; let
|
^
168|
… while evaluating 'outputs'
at /nix/store/l22dazwy8cgxdvndhq45br310nap92x3-source/etc/nixos/flake.nix:167:15:
166|
167| outputs = inputs@{ self, nix, nixpkgs, flake-utils, flake-compat, ... }: with builtins; with nixpkgs.lib; with flake-utils.lib; let
| ^
168|
… from call site
at «string»:45:21:
44|
45| outputs = flake.outputs (inputs // { self = result; });
| ^
46|
… while evaluating anonymous lambda
at «string»:10:13:
9| builtins.mapAttrs
10| (key: node:
| ^
11| let
… from call site
… while evaluating the attribute 'root'
… while evaluating anonymous lambda
at «string»:2:23:
1|
2| lockFileStr: rootSrc: rootSubdir:
| ^
3|
… from call site
Unfortunately, I cannot provide a minimal reproducible example as I do not know from where in the flake this error is originating.
It turns out that my lib value was actually a function. Unfortunately, since Nix flakes are still unstable, the error message did not show where this was happening.

AST matcher on a specific node

I wrote an AST matcher that finds statements of a specific type. For each matched node, I compute its neighboring siblings. Now I need to run a matcher on those neighbor nodes to verify whether they satisfy my condition.
A Clang AST matcher normally visits the whole tree, node by node. I want to run a matcher against one particular node and get true if that node matches my condition.
Is this possible?
I suggest implementing your own matcher that encapsulates the logic of finding neighbor nodes and matching them against other matchers.
I've put together the following matcher as an example of how that can be done:
using clang::ast_matchers::internal::Matcher;
constexpr auto AVERAGE_NUMBER_OF_NESTED_MATCHERS = 3;
using Matchers =
llvm::SmallVector<Matcher<clang::Stmt>, AVERAGE_NUMBER_OF_NESTED_MATCHERS>;
clang::Stmt *getNeighbor(const clang::Stmt &Node, clang::ASTContext &Context) {
// This is a naive implementation; substitute your own as needed.
auto Parents = Context.getParents(Node);
if (Parents.size() != 1) {
return nullptr;
}
// As we deal with statements, let's assume that the neighbor is the next
// statement in the enclosing compound statement.
if (auto *Parent = Parents[0].get<clang::CompoundStmt>()) {
auto Neighbor = std::adjacent_find(
Parent->body_begin(), Parent->body_end(),
[&Node](const auto *Top, const auto *Bottom) { return Top == &Node; });
if (Neighbor != Parent->body_end()) {
return *std::next(Neighbor);
}
}
return nullptr;
}
AST_MATCHER_P(clang::Stmt, neighbors, Matchers, NestedMatchers) {
// Node is the current tested node
const clang::Stmt *CurrentNode = &Node;
// Our goal is to iterate over the given matchers and match the current node
// with the first matcher.
//
// Further on, we plan on checking whether the next
// matcher matches the neighbor/sibling of the previous node.
for (auto NestedMatcher : NestedMatchers) {
// This is how one can call a matcher to test one node.
//
// NOTE: it uses Finder and Builder, so it's better to do it from
// inside of a matcher and get those for free
if (CurrentNode == nullptr or
not NestedMatcher.matches(*CurrentNode, Finder, Builder)) {
return false;
}
// Here you can put your own implementation of finding neighbor/sibling
CurrentNode = getNeighbor(*CurrentNode, Finder->getASTContext());
}
return true;
}
I hope that the comments within the snippet cover the main ideas behind this matcher.
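The std::adjacent_find call in getNeighbor may look unusual, so here is a standalone sketch of the same idea on a plain vector of pointers, with no Clang dependency. The helper name nextSibling is made up for illustration. The predicate ignores the second element of each pair and only tests the first, so the returned iterator points at the target and std::next gives its successor.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper mirroring getNeighbor: returns the element that follows
// Target (compared by address) in Items, or nullptr if Target is the last
// element or is not present at all.
const int *nextSibling(const std::vector<const int *> &Items,
                       const int *Target) {
  // adjacent_find only visits pairs (i, i + 1), so the last element is never
  // passed as Top and a match always has a valid successor.
  auto It = std::adjacent_find(
      Items.begin(), Items.end(),
      [Target](const int *Top, const int * /*Bottom*/) {
        return Top == Target;
      });
  return It != Items.end() ? *std::next(It) : nullptr;
}
```

Because only adjacent pairs are examined, the "target is last" case falls out naturally as "not found", which is why getNeighbor can dereference std::next(Neighbor) without an extra bounds check.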
Demo
Matcher:
neighbors({declStmt().bind("first"), forStmt().bind("second"),
returnStmt().bind("third")})
Code snippet:
int foo() {
int x = 42;
int y = 10;
for (; x > y; --x) {
}
return x;
}
Output:
first:
DeclStmt 0x4c683e0
`-VarDecl 0x4c68360 used y 'int' cinit
`-IntegerLiteral 0x4c683c0 'int' 10
second:
ForStmt 0x4c684d0
|-<<<NULL>>>
|-<<<NULL>>>
|-BinaryOperator 0x4c68468 '_Bool' '>'
| |-ImplicitCastExpr 0x4c68438 'int' <LValueToRValue>
| | `-DeclRefExpr 0x4c683f8 'int' lvalue Var 0x4c682b0 'x' 'int'
| `-ImplicitCastExpr 0x4c68450 'int' <LValueToRValue>
| `-DeclRefExpr 0x4c68418 'int' lvalue Var 0x4c68360 'y' 'int'
|-UnaryOperator 0x4c684a8 'int' lvalue prefix '--'
| `-DeclRefExpr 0x4c68488 'int' lvalue Var 0x4c682b0 'x' 'int'
`-CompoundStmt 0x4c684c0
third:
ReturnStmt 0x4c68540
`-ImplicitCastExpr 0x4c68528 'int' <LValueToRValue>
`-DeclRefExpr 0x4c68508 'int' lvalue Var 0x4c682b0 'x' 'int'
I hope that answers your question!

Illegal Argument: ParseTree error on small language

I'm stuck on this problem for a while now, hope you can help. I've got the following (shortened) language grammar:
lexical Id = [a-zA-Z][a-zA-Z]* !>> [a-zA-Z] \ MyKeywords;
lexical Natural = [1-9][0-9]* !>> [0-9];
lexical StringConst = "\"" ![\"]* "\"";
keyword MyKeywords = "value" | "Male" | "Female";
start syntax Program = program: Model* models;
syntax Model = Declaration;
syntax Declaration = decl: "value" Id name ':' Type t "=" Expression v ;
syntax Type = gender: "Gender";
syntax Expression = Terminal;
syntax Terminal = id: Id name
| constructor: Type t '(' {Expression ','}* ')'
| Gender;
syntax Gender = male: "Male"
| female: "Female";
alias ASLId = str;
data TYPE = gender();
public data PROGRAM = program(list[MODEL] models);
data MODEL = decl(ASLId name, TYPE t, EXPR v);
data EXPR = constructor(TYPE t, list[EXPR] args)
| id(ASLId name)
| male()
| female();
Now, I'm trying to parse:
value mannetje : Gender = Male
This parses fine, but fails on implode unless I remove id: Id name and its constructor from the grammar. I expected that \ MyKeywords would prevent this, but unfortunately it doesn't. Can you help me fix this, or point me in the right direction on how to debug it? I'm having some trouble debugging the concrete and abstract syntax.
Thanks!
It does not seem to be parsing at all (I get a ParseError if I try your example).
One of the problems is probably that you don't define Layout. This causes the ParseError with your given example. One of the easiest fixes is to extend the standard Layout in lang::std::Layout. This layout defines all the default whitespace (and comment) characters.
For more information on nonterminals see here.
I took the liberty of simplifying your example a bit further so that parsing and imploding work. I removed some unused nonterminals to keep the parse tree more concise. You probably want more than Declarations in your Program, but I leave that up to you.
extend lang::std::Layout;
lexical Id = ([a-z] !<< [a-z][a-zA-Z]* !>> [a-zA-Z]) \ MyKeywords;
keyword MyKeywords = "value" | "Male" | "Female" | "Gender";
start syntax Program = program: Declaration* decls;
syntax Declaration = decl: "value" Id name ':' Type t "=" Expression v ;
syntax Type = gender: "Gender";
syntax Expression
= id: Id name
| constructor: Type t '(' {Expression ','}* ')'
| Gender
;
syntax Gender
= male: "Male"
| female: "Female"
;
data PROGRAM = program(list[DECL] exprs);
data DECL = decl(str name, TYPE t, EXPR v);
data EXPR = constructor(TYPE t, list[EXPR] args)
| id(str name)
| male()
| female()
;
data TYPE = gender();
Two things:
The names of the ADTs should correspond to the nonterminal names (you have different cases, and EXPR is not Expression). That is the only way implode can know how to do its work. Put the data declarations in their own module and implode as follows: implode(#AST::Program, pt), where pt is the parse tree.
The grammar was ambiguous: the \ MyKeywords only applied to the tail of the identifier syntax. Use the fix: ([a-zA-Z][a-zA-Z]* !>> [a-zA-Z]) \ MyKeywords;.
Here's what worked for me (grammar unchanged except for the fix):
module AST
alias ASLId = str;
data Type = gender();
public data Program = program(list[Model] models);
data Model = decl(ASLId name, Type t, Expression v);
data Expression = constructor(Type t, list[Expression] args)
| id(ASLId name)
| male()
| female();

Having some simple problems with Scala combinator parsers

First, the code:
package com.digitaldoodles.markup
import scala.util.parsing.combinator.{Parsers, RegexParsers}
import com.digitaldoodles.rex._
class MarkupParser extends RegexParsers {
val stopTokens = (Lit("{{") | "}}" | ";;" | ",,").lookahead
val name: Parser[String] = """[##!$]?[a-zA-Z][a-zA-Z0-9]*""".r
val content: Parser[String] = (patterns.CharAny ** 0 & stopTokens).regex
val function: Parser[Any] = name ~ repsep(content, "::") <~ ";;"
val block1: Parser[Any] = "{{" ~> function
val block2: Parser[Any] = "{{" ~> function <~ "}}"
val lst: Parser[Any] = repsep("[a-z]", ",")
}
object ParseExpr extends MarkupParser {
def main(args: Array[String]) {
println("Content regex is ", (patterns.CharAny ** 0 & stopTokens).regex)
println(parseAll(block1, "{{#name 3:4:foo;;"))
println(parseAll(block2, "{{#name 3:4:foo;; stuff}}"))
println(parseAll(lst, "a,b,c"))
}
}
then, the run results:
[info] == run ==
[info] Running com.digitaldoodles.markup.ParseExpr
(Content regex is ,(?:[\s\S]{0,})(?=(?:(?:\{\{|\}\})|;;)|\,\,))
[1.18] parsed: (#name~List(3:4:foo))
[1.24] failure: `;;' expected but `}' found
{{#name 3:4:foo;; stuff}}
^
[1.1] failure: string matching regex `\z' expected but `a' found
a,b,c
^
I use a custom library to assemble some of my regexes, so I've printed out the "content" regex; it's supposed to match basically any text up to but not including certain token patterns, enforced using a positive lookahead assertion.
Finally, the problems:
1) The first run on "block1" succeeds, but shouldn't, because the separator in the "repsep" function is "::", yet ":" are parsed as separators.
2) The run on "block2" fails, presumably because the lookahead clause isn't working--but I can't figure out why this should be. The lookahead clause was already exercised in the "repsep" on the run on "block1" and seemed to work there, so why should it fail on block 2?
3) The simple repsep exercise on "lst" fails because internally, the parser engine seems to be looking for a boundary--is this something I need to work around somehow?
Thanks,
Ken
1) No, ":" is not being parsed as a separator. If it were, the output would be (#name~List(3, 4, foo)). The actual output shows a single list element, 3:4:foo.
2) It happens because "}}" is also a delimiter, so it takes the longest match it can -- the one that includes ";;" as well. If you make the preceding expression non-eager, it will then fail at "s" on "stuff", which I presume is what you expected.
3) You passed a literal, not a regex. Modify "[a-z]" to "[a-z]".r and it will work.
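Point 2 can be reproduced outside Scala. The sketch below uses std::regex with a pattern that approximates the printed "content" regex (it is hand-written here, not produced by the rex library, and the function names are made up). It compares a greedy and a non-greedy quantifier in front of the same lookahead on the remainder of the block2 input.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text matched by Pattern in Input, or "" if there is no match.
std::string matchedPrefix(const std::string &Pattern,
                          const std::string &Input) {
  std::smatch M;
  return std::regex_search(Input, M, std::regex(Pattern)) ? M.str(0)
                                                          : std::string();
}

// Stop tokens from the question as a positive lookahead (an approximation of
// the regex shown in the run output).
const std::string Stop = R"((?=\{\{|\}\}|;;|,,))";

// Greedy: consumes as much as possible, so it stops before the LAST stop token.
std::string greedyContent(const std::string &Input) {
  return matchedPrefix("^[\\s\\S]*" + Stop, Input);
}

// Non-greedy: consumes as little as possible, so it stops before the FIRST one.
std::string lazyContent(const std::string &Input) {
  return matchedPrefix("^[\\s\\S]*?" + Stop, Input);
}
```

On the input "3:4:foo;; stuff}}", the greedy version consumes "3:4:foo;; stuff" (swallowing the ";;" that the grammar expects next), while the non-greedy version stops at "3:4:foo".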
