Clang-format style rules for parentheses and access modifier - clang-format

I am trying to write my own clang-format style file. There are two things I cannot get right.
First, how do I make it keep an empty line after public:, private:, and protected:? For example, I would like to have
public :

ctor () {}
instead of
public :
ctor () {}
The second issue: is there a way to make it insert a space before the parentheses when they follow a control statement or a function definition, but not before the parentheses of a function call? For example, I would want
void func () {}
func()
SpaceBeforeParens can only be one of Never, Always, or ControlStatements. The last is closest to what I want, yet it still does not do what I want. A minor related issue is that it always removes the space before the parenthesis when it follows an operator name, for example
C &operator=(const C &);
I am more used to
C &operator= (const C &);

Related

How to use infix notation in Android properly (Kotlin language)?

I have read the Kotlin docs as well as the Wikipedia article (https://en.wikipedia.org/wiki/Infix_notation#:~:text=Infix%20notation%20is%20the%20notation,plus%20sign%20in%202%20%2B%202), but unfortunately I am still unable to use this notation in my code.
Could someone please show me how to use it?
Before jumping into the code, let's look at the rules for its usage (from the docs: https://kotlinlang.org/docs/functions.html#infix-notation):
Infix functions must be member functions or extension functions.
They must have a single parameter.
The parameter must not accept a variable number of arguments and must have no default value.
Keeping these rules in mind, you can write the following:
fun Int.add(x: Int) = this.plus(x) // This is a simple extension function without infix notation
infix fun Int.subtract(x: Int) = this.minus(x) // Added infix notation here
fun main() {
    val sum = 10.add(20)
    println(sum) // prints 30
    val sub = 100 subtract 30 // Notice that there is no dot(.) and parenthesis
    println(sub) // prints 70
}
This is how infix notation lets you drop the dot (.) and the parentheses; the call behaves exactly the same.
It can make code more readable.

How do I prevent characters from the previous line from appearing when overwriting it in Rust?

I have the following code which enables me to make console output appear on the same line. However, if a value that was previously printed was of greater length than values after it, the remnants of the longer value will show up. I have seen other questions about the same thing in languages like Python, but I'm not sure how to overcome this in Rust.
Here's an example:
use std::io::prelude::*;
fn main() {
    let fruits = ["Blueberry", "Orange", "Cherry", "Lemon", "Apple"];
    print_value(&fruits);
}
fn print_value(e: &[&str]) {
    for val in e {
        print!("\rStatus: {}", val);
        std::io::stdout().flush().unwrap();
        // pause program temporarily
        std::thread::sleep(std::time::Duration::new(2, 0));
    }
}
Some terminals have a special character sequence that, when printed, clears the line to the right of the current cursor position.
VT100-compatible terminals have a character sequence EL0 for that. In Rust it can be expressed with "\x1B[K".
To do that in a more portable way, use a terminal library such as term and its delete_line method.
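The escape-sequence approach can be sketched as follows. This is a Python illustration rather than the asker's Rust, and the names status_line and print_values are hypothetical; the only piece taken from the answer is the "\x1B[K" sequence, which would be appended to the format string in the same way inside Rust's print! macro. It assumes a VT100-compatible terminal.

```python
import sys
import time

# EL (Erase in Line) with its default parameter 0:
# erase from the cursor to the end of the line.
CLEAR_TO_EOL = "\x1b[K"


def status_line(value: str) -> str:
    """Return the text that overwrites the current line with a new status."""
    # "\r" moves the cursor to column 0; after the new text is written,
    # CLEAR_TO_EOL wipes whatever is left over from a longer previous value.
    return f"\rStatus: {value}{CLEAR_TO_EOL}"


def print_values(values, delay=0.0, out=sys.stdout):
    """Print each value on the same console line, pausing between values."""
    for value in values:
        out.write(status_line(value))
        out.flush()
        time.sleep(delay)
    out.write("\n")


if __name__ == "__main__":
    print_values(["Blueberry", "Orange", "Cherry", "Lemon", "Apple"], delay=2.0)
```

Writing the erase sequence after the text, rather than padding with spaces, avoids having to track the length of the previous value.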

F# operator overloading strange behavior

Let's say that for some strange reason I want to have this function:
let (~-) (str:string) = 42
So I can do something like this and get 42 as result:
-"test"
val it : int = 42
Which is fine. But now when I do:
let a = 100
-a
I get:
error FS0001: This expression was expected to have type
string
but here has type
int
Any idea why is this happening?
When you define operators using let, the new definition hides all previous definitions of the operator. So in your example, you are hiding the default implementation of unary minus (which works for numbers) and replacing it with a new operator that only works on strings.
It is not easy to redefine overloaded operators on built-in types. If you need that, it is probably a better idea to avoid operators and just use a function. However, if you want to provide an overloaded operator for a custom type, you can do so by adding the operator as a static member:
type MinusString(s:string) =
  member x.Value = s
  /// Provide unary minus for MinusString values
  static member (~-) (ms:MinusString) =
    MinusString("-" + ms.Value)
-(MinusString "hi") // Returns a MinusString whose Value is "-hi"
If you really want to redefine a built-in operator such as unary minus and make it work on strings, there is a way to do this using a trick described in earlier SO answers. However, I would only use it if you have a good reason.
Simply, you overwrote the minus operator with one that takes a string and returns an int, then tried to apply it to an int, which it can't do anymore.

Lua source code manipulation: get innermost function() location for a given line

I've got a file with syntactically correct Lua 5.1 source code.
I've got a position (line and character offset) inside that file.
I need to get an offset in bytes to the closing parenthesis of the innermost function() body that contains that position (or figure out that the position belongs to the main chunk of the file).
I.e.:
local function foo()
                   ^ result
  print("bar")
    ^ input
end
local foo = function()
                     ^ result
  print("bar")
    ^ input
end
local foo = function()
  return function()
                  ^ result
    print("bar")
      ^ input
  end
end
...And so on.
How do I do that robustly?
EDIT: My original answer did not take the "innermost" requirement into account; this version does.
To make things "robust," there are a few considerations.
First of all, it's important that you skip over string and comment contents, to avoid incorrect output in situations like:
foo = function()
  print(" function() ")
  -- function()
  print("bar")
    ^ input
end
This can be somewhat difficult, considering Lua's nested string and comment syntax. Consider, for example, a situation where the input begins in a nested string or comment:
foo = function()
  print([[
    bar = function()
      print("baz")
        ^ input
    end
  ]])
end
Consequently, if you want a completely robust system, it is not enough to parse backwards only until you hit the end of a function parameter list, because you may not have gone back far enough to reach a [[ that would invalidate your match. It is therefore necessary to parse the entire file up to your position, unless you are okay with incorrect matches in these unusual situations. (If this is an editor plugin, these "incorrect" results may actually be desirable, because they would let you use the same plugin to edit Lua code that is stored as a string literal inside other Lua code.)
Because the particular syntax that you're trying to match doesn't have any kind of "nesting", a full-blown parser isn't needed. You will need to maintain a stack, however, to keep track of scope. With that in mind, all you need to do is step through the source file character-by-character from the beginning, applying the following logic:
Every time a " or ' is encountered, ignore the characters up to the closing " or '. Be careful to handle escapes like \" and \\
Every time a -- is encountered, ignore the characters up to the closing newline for the comment. Be careful to only do this if the comment is not a multiline comment.
Every time a multiline string opening symbol is encountered (such as [[, [=[, etc), or a multiline comment symbol is encountered (such as --[[ or --[=[, etc) ignore the characters up until the closing square brackets with the proper number of matching equals signs between them.
When a word boundary is encountered, check whether the characters after it could begin a block that ends with an end (for example, if, while, for, function, etc. DO NOT include repeat). If so, push the position onto the scope stack. Note that every while and for loop also contains a do, so count either the loop keyword or its do, not both. A "word boundary" in this case is any character that could not be used in a Lua identifier (this prevents matches in cases like abcfunction()). The beginning of the file is also considered a word boundary.
If a word boundary is encountered and it is followed by end, pop the top element of the stack. If the stack has no elements, complain about a syntax error.
When you finally step forward and reach your "input" position, pop elements from the stack until you find a function scope. Step forward from that position to the next ), ignoring )'s in comments (which could theoretically be found in an argument list if it spans multiple lines or contains inline --[[ ]] comments). That position is your result.
This should handle every case, including situations where the function syntactic sugar is used, like
function foo()
print("bar")
end
which you did not include in your example but which I imagine you still want to match.
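The steps above can be sketched roughly as follows. This is a minimal Python illustration, not a tested Lua tool: the function names are hypothetical, the position is taken as a single character offset (which equals the byte offset only for ASCII sources) rather than a line and column, and blocks are counted by function, if and do, so that while and for loops are covered by their do and are not pushed twice.

```python
import re
from typing import Optional

LONG_OPEN = re.compile(r"\[(=*)\[")          # [[, [=[, [==[, ...
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
OPENERS = {"function", "if", "do"}           # each of these is closed by `end`


def skip_long_bracket(src: str, i: int) -> Optional[int]:
    """If a long bracket opens at i, return the index just past its close."""
    m = LONG_OPEN.match(src, i)
    if not m:
        return None
    close = "]" + m.group(1) + "]"
    end = src.find(close, m.end())
    return len(src) if end < 0 else end + len(close)


def skip_trivia(src: str, i: int) -> int:
    """Return the index just past a comment or string starting at i, else i."""
    if src.startswith("--", i):
        j = skip_long_bracket(src, i + 2)    # --[[ ... ]] multiline comment
        if j is not None:
            return j
        nl = src.find("\n", i)               # ordinary line comment
        return len(src) if nl < 0 else nl
    if src[i] in "\"'":
        quote, j = src[i], i + 1
        while j < len(src) and src[j] != quote:
            j += 2 if src[j] == "\\" else 1  # step over escapes like \" and \\
        return j + 1
    j = skip_long_bracket(src, i)            # [[ ... ]] multiline string
    return i if j is None else j


def find_close_paren(src: str, i: int) -> Optional[int]:
    """Offset of the next ')' at or after i, ignoring comments."""
    while i < len(src):
        j = skip_trivia(src, i)
        if j != i:
            i = j
        elif src[i] == ")":
            return i
        else:
            i += 1
    return None


def innermost_function_paren(src: str, pos: int) -> Optional[int]:
    """Offset of the ')' closing the parameter list of the innermost function
    containing pos, or None if pos belongs to the main chunk."""
    stack = []                               # (keyword, offset) of open blocks
    i = 0
    while i < pos and i < len(src):
        j = skip_trivia(src, i)
        if j != i:
            if j > pos:                      # pos lies inside a string or comment
                break
            i = j
            continue
        m = WORD.match(src, i)
        if not m:
            i += 1
            continue
        if m.group() in OPENERS:
            stack.append((m.group(), m.start()))
        elif m.group() == "end":
            if not stack:
                raise SyntaxError("unbalanced 'end' at offset %d" % i)
            stack.pop()
        i = m.end()
    for keyword, start in reversed(stack):   # innermost enclosing function first
        if keyword == "function":
            return find_close_paren(src, start)
    return None
```

Matching whole identifiers with WORD is what implements the word-boundary rule here: abcfunction and elseif are each consumed as one word and never seen as function or if.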

How to convert method calls to postfix notation?

I'm writing a compiler for a JavaScript-like language for fun. In other words, I'm learning about the wheel by building one myself and trying to figure everything out, but now I'm stuck.
I know that shunting yard algorithm is a nice one when parsing simple infix expressions. I was able to figure out how to extend this algorithm for prefix and postfix operators too and also able to parse simple functions.
For example: 2+3*a(3,5)+b(3,5) turns into 2 3 <G> 3 5 a () * + <G> 3 5 b () +
(<G> is a guard token that is pushed on the stack; it will store the return address etc. () is the call command: it calls the function on top of the stack, which pops the necessary number of arguments and pushes the result back on return.)
If the function name is just one token I can simply mark it as function symbol if directly followed by a parenthesis. During the process if I encounter a function symbol I push it on the operator stack and pop it out when I finished converting the parameters.
This is working so far.
But if I add member functions via the . operator, things get trickier. For example, I want to convert a.b.c(12)+d.e.f(34). I can't mark c and f as functions, because the functions are a.b.c and d.e.f. If I run my parser on an expression like this, the result is a b . <G> 12 c () . d e . <G> 34 f () . which is obviously wrong. I want it to be <G> 12 a b . c . () <G> 34 d e . f . () which appears correct.
But of course I can make things more complicated by adding some parentheses: (a.b.c)(). Or I can write a function that returns a function, which I then call again: f(a,b)(c,d).
Is there an easy way to handle these tricky situations?
One problem with your approach is that you treat an object and its member as two separate tokens separated by a dot. The classical shunting-yard algorithm knows nothing about OOP and relies on a single token for a function call. So the first way to resolve your problem is to use one token for a call of an object member, i.e. the entire a.b.c must be a single token.
You may also consider parser generators as another solution. They let you define the complete grammar of your target language (JavaScript) as a set of formal rules and generate the parser automatically. Popular tools, which generate parsers in various programming languages, include ANTLR, Bison + Lex, and Lemon + Ragel.
--artem
(I saw this question is still alive. I found the solution for it myself.)
First I treat the (...) and [...] expressions as single tokens and expand them (recursively) when needed. Then I detect function calls and array subscripts: if there is no infix operator before a parenthesized token, it is a function call or an array subscript, so I insert a special call-function or access operator there. With this modification it works like a charm.
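A rough Python sketch of this idea follows. It is an illustration under assumptions, not the asker's implementation: all names are hypothetical, only the binary operators + - * / and . are handled, and instead of the <G> guard token the output uses callN (a call with N arguments, emitted after the callee and its arguments) and index for subscripts.

```python
import re

TOKEN = r"\d+|[A-Za-z_]\w*|[-+*/.,()\[\]]"
PREC = {"+": 1, "-": 1, "*": 2, "/": 2, ".": 3}   # all binary, left-associative
POSTFIX_PREC = 3        # calls and subscripts bind as tightly as member access
CLOSER = {"(": ")", "[": "]"}


def group(tokens):
    """Fold each (...) or [...] into one nested item: (open_bracket, items)."""
    stack, opens = [[]], []
    for tok in tokens:
        if tok in CLOSER:
            opens.append(tok)
            stack.append([])
        elif tok in CLOSER.values():
            if not opens or CLOSER[opens[-1]] != tok:
                raise SyntaxError("unbalanced " + tok)
            inner = stack.pop()
            stack[-1].append((opens.pop(), inner))
        else:
            stack[-1].append(tok)
    if opens:
        raise SyntaxError("unclosed " + opens[-1])
    return stack[0]


def split_args(items):
    """Split a bracket's items on commas; an empty bracket has no arguments."""
    if not items:
        return []
    args = [[]]
    for item in items:
        if item == ",":
            args.append([])
        else:
            args[-1].append(item)
    return args


def to_postfix(items):
    """Shunting-yard over items in which every bracket group is one token."""
    out, ops = [], []
    prev_is_operand = False
    for item in items:
        if isinstance(item, tuple):
            bracket, inner = item
            if prev_is_operand:
                # No infix operator before the group: a call or a subscript.
                # Flush pending member accesses so the callee is complete.
                while ops and PREC[ops[-1]] >= POSTFIX_PREC:
                    out.append(ops.pop())
                args = split_args(inner)
                for arg in args:
                    out += to_postfix(arg)
                out.append(f"call{len(args)}" if bracket == "(" else "index")
            else:
                out += to_postfix(inner)     # ordinary parenthesized expression
            prev_is_operand = True
        elif item in PREC:
            while ops and PREC[ops[-1]] >= PREC[item]:
                out.append(ops.pop())
            ops.append(item)
            prev_is_operand = False
        else:
            out.append(item)                 # identifier or number
            prev_is_operand = True
    while ops:
        out.append(ops.pop())
    return out


def convert(text):
    """Convert an infix expression string to a space-separated postfix string."""
    return " ".join(to_postfix(group(re.findall(TOKEN, text))))
```

With these conventions, convert("a.b.c(12)+d.e.f(34)") yields a b . c . 12 call1 d e . f . 34 call1 +, and the same rule covers (a.b.c)() and f(a,b)(c,d), because a bracket group that follows another group is also treated as a call.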
