clang-format function argument indent - clang-format

Is there a way to configure clang-format to behave in this way? Notice how each parameter is on its own line and is indented by only 1 level as opposed to aligned with the function name. This is my preferred coding style, but I just can't get clang-format to do this.
int a_very_long_function_name(
int var_a,
double my_b,
char *str
) {
int d = 0;
/* ... */
return d;
}

As far as I know, this is not possible.
You can try all the different parameters with this nifty tool: https://zed0.co.uk/clang-format-configurator/

Should be possible now according to: https://clang.llvm.org/docs/ClangFormatStyleOptions.html
AlignAfterOpenBracket: BlockIndent

Related

How to add local variables to yylex function in flex lexer?

I was writing a lexer file that matches simple custom delimited strings of the form xyz$this is stringxyz. This is nearly how I did it:
%{
char delim[16];
uint8_t dlen;
%}
%%
.*$ {
dlen = yyleng-1;
strncpy(delim, yytext, dlen);
BEGIN(STRING);
}
<STRING>. {
if(yyleng >= dlen) {
if(strncmp(delim, yytext[yyleng-dlen], dlen) == 0) {
BEGIN(INITIAL);
return STR;
}
}
yymore();
}
%%
Now I wanted to convert this to reentrant lexer. But I don't know how to make delim and dlen as local variables inside yylex apart from modifying generated lexer. Someone please help me how should I do this.
I don't recommend to store these in yyextra because, these variables need not persist across multiple calls to yylex. Hence I would prefer an answer that guides me towards declaring these as local variables.
In the (f)lex file, any indented lines between the %% and the first rule are copied verbatim into yylex() prior to the first statement, precisely to allow you to declare and initialize local variables.
This behaviour is guaranteed by the Posix specification; it is not a flex extension: (emphasis added)
Any such input (beginning with a <blank>or within "%{" and "%}" delimiter lines) appearing at the beginning of the Rules section before any rules are specified shall be written to lex.yy.c after the declarations of variables for the yylex() function and before the first line of code in yylex(). Thus, user variables local to yylex() can be declared here, as well as application code to execute upon entry to yylex().
A similar statement is in the Flex manual section 5.2, Format of the Rules Section
The strategy you propose will work, certainly, but it's not very efficient. You might want to consider using input() to read characters one at a time, although that's not terribly efficient either. In any event, delim is unnecessary:
%%
int dlen;
[^$\n]{1,16}\$ {
dlen = yyleng-1;
yymore();
BEGIN(STRING);
}
<STRING>. {
if(yyleng > dlen * 2) {
if(memcmp(yytext, yytext + yyleng - dlen, dlen) == 0) {
/* Remove the delimiter from the reported value of yytext. */
yytext += dlen + 1;
yyleng -= 2 * dlen + 1;
yytext[yyleng] = 0;
return STR;
}
}
yymore();
}
%%

Clang-format struct initialization - indent with two spaces?

I'm trying to format a C file with clang-format. I want indentations to be two space characters, which mostly works except for in global variable struct initialization. For this, it continues to produce lines which are indented with four spaces.
Here is my .clang-format file
BasedOnStyle: LLVM
ColumnLimit: '80'
IndentWidth: '2'
clang-format produces this surprising output
typedef struct {
int x;
} foo;
static foo bar{
1, // This line is being indented with 4 spaces!
};
And this is what I'd expect the file to look like:
typedef struct {
int x;
} foo;
static foo bar{
1, // This line is being indented with 2 spaces!
};
I've tried using a few different values for ConstructorInitializerIndentWidth, but that field doesn't appear to affect this pattern.
Is there a setting that I could provide to get this behavior?
Try ContinuationIndentWidth: 2. That worked for me, when I had that problem.

How do I find the SourceLocation of the commas between function arguments using libtooling?

My main goal is trying to get macros (or even just the text) before function parameters. For example:
void Foo(_In_ void* p, _Out_ int* x, _Out_cap_(2) int* y);
I need to gracefully handle things like macros that declare parameters (by ignoring them).
#define Example _In_ int x
void Foo(Example);
I've looked at Preprocessor record objects and used Lexer::getSourceText to get the macro names In, Out, etc, but I don't see a clean way to map them back to the function parameters.
My current solution is to record all the macro expansions in the file and then compare their SourceLocation to the ParamVarDecl SourceLocation. This mostly works except I don't know how to skip over things after the parameter.
void Foo(_In_ void* p _Other_, _In_ int y);
Getting the SourceLocation of the comma would work, but I can't find that anywhere.
The title of the questions asks for libclang, but as you use Lexer::getSourceText I assume that it's libTooling. The rest of my answer is viable only in terms of libTooling.
Solution 1
Lexer works on the level of tokens. Comma is also a token, so you can take the end location of a parameter and fetch the next token using Lexer::findNextToken.
Here is a ParmVarDecl (for function parameters) and CallExpr (for function arguments) visit functions that show how to use it:
template <class T> void printNextTokenLocation(T *Node) {
auto NodeEndLocation = Node->getSourceRange().getEnd();
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
auto NextToken = Lexer::findNextToken(NodeEndLocation, SM, LO);
if (!NextToken) {
return;
}
auto NextTokenLocation = NextToken->getLocation();
llvm::errs() << NextTokenLocation.printToString(SM) << "\n";
}
bool VisitParmVarDecl(ParmVarDecl *Param) {
printNextTokenLocation(Param);
return true;
}
bool VisitCallExpr(CallExpr *Call) {
for (auto *Arg : Call->arguments()) {
printNextTokenLocation(Arg);
}
return true;
}
For the following code snippet:
#define FOO(x) int x
#define BAR float d
#define MINUS -
#define BLANK
void foo(int a, double b ,
FOO(c) , BAR) {}
int main() {
foo( 42 ,
36.6 , MINUS 10 , BLANK 0.0 );
return 0;
}
it produces the following output (six locations for commas and two for parentheses):
test.cpp:6:15
test.cpp:6:30
test.cpp:7:19
test.cpp:7:24
test.cpp:10:17
test.cpp:11:12
test.cpp:11:28
test.cpp:11:43
This is quite a low-level and error-prone approach though. However, you can change the way you solve the original problem.
Solution 2
Clang stores information about expanded macros in its source locations. You can find related methods in SourceManager (for example, isMacroArgExpansion or isMacroBodyExpansion). As the result, you can visit ParmVarDecl nodes and check their locations for macro expansions.
I would strongly advice moving in the second direction.
I hope this information will be helpful. Happy hacking with Clang!
UPD speaking of attributes, unfortunately, you won't have a lot of choices. Clang does ignore any unknown attribute and this behaviour is not tweakable. If you don't want to patch Clang itself and add your attributes to Attrs.td, then you're limited indeed to tokens and the first approach.

clang-format: Align asterisk (*) of pointer declaration with variable name

I am using the following options in my .clang-format file:
AlignConsecutiveDeclarations: true
PointerAlignment: Right
The current formatting result is the following:
char * var1;
SomeOtherType *var2;
int var3;
The result I was expecting would be:
char *var1; //note the changed position of *
SomeOtherType *var2;
int var3;
How can I configure clang-format to align the asterix (*) with the variable name rather then with the type when I am
using the AlignConsecutiveDeclarations option?
PointerAlignment: Right is unfortunately not implemented yet.
See https://github.com/llvm/llvm-project/blob/master/clang/lib/Format/WhitespaceManager.cpp#L643
void WhitespaceManager::alignConsecutiveDeclarations() {
if (!Style.AlignConsecutiveDeclarations)
return;
// FIXME: Currently we don't handle properly the PointerAlignment: Right
// The * and & are not aligned and are left dangling. Something has to be done
// about it, but it raises the question of alignment of code like:
// const char* const* v1;
// float const* v2;
// SomeVeryLongType const& v3;
AlignTokens(Style, [](Change const &C) { return C.IsStartOfDeclName; },
Changes);
}
It's fixed now!
The review https://reviews.llvm.org/D27651 has been re-applied in https://reviews.llvm.org/D103245 and committed in https://reviews.llvm.org/rG3e333cc82e42e1e2ecc974d896489eebe1a5edc2.
This change will be included in LLVM 13 release.

How to find functions in a cpp file that contain a specific word

using grep, vim's grep, or another unix shell command, I'd like to find the functions in a large cpp file that contain a specific word in their body.
In the files that I'm working with the word I'm looking for is on an indented line, the corresponding function header is the first line above the indented line that starts at position 0 and is not a '{'.
For example searching for JOHN_DOE in the following code snippet
int foo ( int arg1 )
{
/// code
}
void bar ( std::string arg2 )
{
/// code
aFunctionCall( JOHN_DOE );
/// more code
}
should give me
void bar ( std::string arg2 )
The algorithm that I hope to catch in grep/vim/unix shell scripts would probably best use the indentation and formatting assumptions, rather than attempting to parse C/C++.
Thanks for your suggestions.
I'll probably get voted down for this!
I am an avid (G)VIM user but when I want to review or understand some code I use Source Insight. I almost never use it as an actual editor though.
It does exactly what you want in this case, e.g. show all the functions/methods that use some highlighted data type/define/constant/etc... in a relations window...
(source: sourceinsight.com)
Ouch! There goes my rep.
As far as I know, this can't be done. Here's why:
First, you have to search across lines. No problem, in vim adding a _ to a character class tells it to include new lines. so {_.*} would match everything between those brackets across multiple lines.
So now you need to match whatever the pattern is for a function header(brittle even if you get it to work), then , and here's the problem, whatever lines are between it and your search string, and finally match your search string. So you might have a regex like
/^\(void \+\a\+ *(.*)\)\_.*JOHN_DOE
But what happens is the first time vim finds a function header, it starts matching. It then matches every character until it finds JOHN_DOE. Which includes all the function headers in the file.
So the problem is that, as far as I know, there's no way to tell vim to match every character except for this regex pattern. And even if there was, a regex is not the tool for this job. It's like opening a beer with a hammer. What we should do is write a simple script that gives you this info, and I have.
fun! FindMyFunction(searchPattern, funcPattern)
call search(a:searchPattern)
let lineNumber = line(".")
let lineNumber = lineNumber - 1
"call setpos(".", [0, lineNumber, 0, 0])
let lineString = getline(lineNumber)
while lineString !~ a:funcPattern
let lineNumber = lineNumber - 1
if lineNumber < 0
echo "Function not found :/"
endif
let lineString = getline(lineNumber)
endwhile
echo lineString
endfunction
That should give you the result you want and it's way easier to share, debug, and repurpose than a regular expression spit from the mouth of Cthulhu himself.
Tough call, although as a starting point I would suggest this wonderful VIM Regex Tutorial.
You cannot do that reliably with a regular expression, because code is not a regular language. You need a real parser for the language in question.
Arggh! I admit this is a bit over the top:
A little program to filter stdin, strip comments, and put function bodies on the same line. It'll get fooled by namespaces and function definitions inside class declarations, besides other things. But it might be a good start:
#include <stdio.h>
#include <assert.h>
int main() {
enum {
NORMAL,
LINE_COMMENT,
MULTI_COMMENT,
IN_STRING,
} state = NORMAL;
unsigned depth = 0;
for(char c=getchar(),prev=0; !feof(stdin); prev=c,c=getchar()) {
switch(state) {
case NORMAL:
if('/'==c && '/'==prev)
state = LINE_COMMENT;
else if('*'==c && '/'==prev)
state = MULTI_COMMENT;
else if('#'==c)
state = LINE_COMMENT;
else if('\"'==c) {
state = IN_STRING;
putchar(c);
} else {
if(('}'==c && !--depth) || (';'==c && !depth)) {
putchar(c);
putchar('\n');
} else {
if('{'==c)
depth++;
else if('/'==prev && NORMAL==state)
putchar(prev);
else if('\t'==c)
c = ' ';
if(' '==c && ' '!=prev)
putchar(c);
else if(' '<c && '/'!=c)
putchar(c);
}
}
break;
case LINE_COMMENT:
if(' '>c)
state = NORMAL;
break;
case MULTI_COMMENT:
if('/'==c && '*'==prev) {
c = '\0';
state = NORMAL;
}
break;
case IN_STRING:
if('\"'==c && '\\'!=prev)
state = NORMAL;
putchar(c);
break;
default:
assert(!"bug");
}
}
putchar('\n');
return 0;
}
Its c++, so just it in a file, compile it to a file named 'stripper', and then:
cat my_source.cpp | ./stripper | grep JOHN_DOE
So consider the input:
int foo ( int arg1 )
{
/// code
}
void bar ( std::string arg2 )
{
/// code
aFunctionCall( JOHN_DOE );
/// more code
}
The output of "cat example.cpp | ./stripper" is:
int foo ( int arg1 ) { }
void bar ( std::string arg2 ){ aFunctionCall( JOHN_DOE ); }
The output of "cat example.cpp | ./stripper | grep JOHN_DOE" is:
void bar ( std::string arg2 ){ aFunctionCall( JOHN_DOE ); }
The job of finding the function name (guess its the last identifier to precede a "(") is left as an exercise to the reader.
For that kind of stuff, although it comes to primitive searching again, I would recommend compview plugin. It will open up a search window, so you can see the entire line where the search occured and automatically jump to it. Gives a nice overview.
(source: axisym3.net)
Like Robert said Regex will help. In command mode start a regex search by typing the "/" character followed by your regex.
Ctags1 may also be of use to you. It can generate a tag file for a project. This tag file allows a user to jump directly from a function call to it's definition even if it's in another file using "CTRL+]".
u can use grep -r -n -H JOHN_DOE * it will look for "JOHN_DOE" in the files recursively starting from the current directory
you can use the following code to practically find the function which contains the text expression:
public void findFunction(File file, String expression) {
Reader r = null;
try {
r = new FileReader(file);
} catch (FileNotFoundException ex) {
ex.printStackTrace();
}
BufferedReader br = new BufferedReader(r);
String match = "";
String lineWithNameOfFunction = "";
Boolean matchFound = false;
try {
while(br.read() > 0) {
match = br.readLine();
if((match.endsWith(") {")) ||
(match.endsWith("){")) ||
(match.endsWith("()")) ||
(match.endsWith(")")) ||
(match.endsWith("( )"))) {
// this here is because i guessed that method will start
// at the 0
if((match.charAt(0)!=' ') && !(match.startsWith("\t"))) {
lineWithNameOfFunction = match;
}
}
if(match.contains(expression)) {
matchFound = true;
break;
}
}
if(matchFound)
System.out.println(lineWithNameOfFunction);
else
System.out.println("No matching function found");
} catch (IOException ex) {
ex.printStackTrace();
}
}
i wrote this in JAVA, tested it and works like a charm. has few drawbacks though, but for starters it's fine. didn't add support for multiple functions containing same expression and maybe some other things. try it.

Resources