Why is it that Lua doesn't have augmented assignment?

This is a question that has mildly irritated me for some time, and I just never got around to searching for the answer.
However, I thought I might at least ask, and perhaps someone can explain.
Basically, many languages I've worked in use syntactic sugar to let you write (using C++ syntax):
int main() {
    int a = 2;
    a += 3; // a = a + 3
}
while in Lua += is not defined, so I would have to write a = a + 3, which again is all about syntactic sugar. When using a more "meaningful" variable name such as bleed_damage_over_time it starts getting tedious to write:
bleed_damage_over_time = bleed_damage_over_time + added_bleed_damage_over_time
instead of:
bleed_damage_over_time += added_bleed_damage_over_time
So what I would like to know is not how to work around this (though if you have a nice solution I would of course be interested in hearing it), but rather why Lua doesn't implement this syntactic sugar.

This is just guesswork on my part, but:
1. It's hard to implement this in a single-pass compiler
Lua's bytecode compiler is implemented as a single-pass recursive descent parser that immediately generates code. It does not parse to a separate AST structure and then in a second pass convert that to bytecode.
This forces some limitations on the grammar and semantics. In particular, anything that requires arbitrary lookahead or forward references is really hard to support in this model. This means assignments are already hard to parse. Given something like:
foo.bar.baz = "value"
When you're parsing foo.bar.baz, you don't realize you're actually parsing an assignment until you hit the = after you've already parsed and generated code for that. Lua's compiler has a good bit of complexity just for handling assignments because of this.
Supporting self-assignment would make that even harder. Something like:
foo.bar.baz += "value"
Needs to get translated to:
foo.bar.baz = foo.bar.baz + "value"
But at the point that the compiler hits the =, it's already forgotten about foo.bar.baz. It's possible, but not easy.
2. It may not play nice with the grammar
Lua doesn't actually have any statement or line separators in the grammar. Whitespace is ignored and there are no mandatory semicolons. You can do:
io.write("one")
io.write("two")
Or:
io.write("one") io.write("two")
And Lua is equally happy with both. Keeping a grammar like that unambiguous is tricky. I'm not sure, but self-assignment operators may make that harder.
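For a taste of what "tricky" means here, the Lua reference manual already documents one ambiguity that falls out of the free-form syntax: a statement that begins with an open parenthesis can also be read as a call continuing the previous statement. For example:
a = b + c
(print or io.write)('done')
is parsed as the single statement a = b + c(print or io.write)('done'), and the manual suggests putting a semicolon in front of such statements to avoid the ambiguity. Every operator added to the statement grammar is one more interaction like this to think through.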
3. It doesn't play nice with multiple assignment
Lua supports multiple assignment, like:
a, b, c = someFnThatReturnsThreeValues()
It's not even clear to me what it would mean if you tried to do:
a, b, c += someFnThatReturnsThreeValues()
You could limit self-assignment operators to single assignment, but then you've just added a weird corner case people have to know about.
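To make the ambiguity concrete, here are two plausible desugarings of that hypothetical statement, written in plain Lua (someFnThatReturnsThreeValues is the same placeholder as above); they do different things:
-- Reading 1: add the three returned values element-wise
local x, y, z = someFnThatReturnsThreeValues()
a, b, c = a + x, b + y, c + z

-- Reading 2: truncate to the first returned value, the way a = a + f() behaves today
a = a + someFnThatReturnsThreeValues()
Neither reading is obviously the right one, which is exactly the kind of corner case in question.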
With all of this, it's not at all clear that self-assignment operators are useful enough to be worth dealing with the above issues.

I think you could just rewrite this question as
Why doesn't <languageX> have <featureY> from <languageZ>?
Typically it's a trade-off that the language designers make based on their vision of what the language is intended for, and their goals.
In Lua's case, the language is intended to be an embedded scripting language, so any changes that make the language more complex or potentially make the compiler/runtime even slightly larger or slower may go against this objective.
If you implement each and every tiny feature, you can end up with a 'kitchen sink' language: Ada, anyone?
And as you say, it's just syntactic sugar.

Another reason why Lua doesn't have self-assignment operators is that table access can be overloaded with metatables to have arbitrary side effects. For self-assignment you would need to choose whether to desugar
foo.bar.baz += 2
into
foo.bar.baz = foo.bar.baz + 2
or into
local tmp = foo.bar
tmp.baz = tmp.baz + 2
The first version runs the __index metamethod for foo twice, while the second one does so only once. Not including self-assignment in the language and forcing you to be explicit helps avoid this ambiguity.
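Here is a small sketch of why that difference is observable, assuming a made-up proxy table whose __index metamethod has a visible side effect:
-- A proxy table: every access to a missing key goes through __index
local lookups = 0
local real = { bar = { baz = 1 } }
local foo = setmetatable({}, {
  __index = function(_, key)
    lookups = lookups + 1   -- side effect: count every indexing of foo
    return real[key]
  end,
})

-- Desugaring 1: foo.bar is evaluated twice (once for the read, once for the write)
foo.bar.baz = foo.bar.baz + 2
print(lookups)              --> 2

-- Desugaring 2: foo.bar is evaluated only once
local tmp = foo.bar
tmp.baz = tmp.baz + 2
print(lookups)              --> 3 (only one more than before)
Both versions update baz the same way, but user code can tell them apart, so the language would have to pick one desugaring and commit to it.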

Related

What is the use of the cond and let reserved words in Erlang?

What is the use of the "cond" and "let" reserved words in Erlang? I saw them in the reserved words list but have never seen them used anywhere in Erlang, nor any examples of them. I also searched on Google but found nothing.
It seems to me that they are just reserved, like "goto" and "const" in Java, but I may be wrong. So please explain those keywords, with small examples as well.
-module(hello_world).
-export([start/0]).   % export needed so start/0 can be called

start() ->
    io:format("hello_world").
I believe they are reserved for the same reason goto is reserved in Java: in this paradigm the terms cond and let have rather specific meanings, and it would be confusing to have programmers use those atoms unquoted.
It may also be that, given the way scoping works with list comprehensions and the general desire to avoid nested case statements (and the aversion to if in most cases), cond or let could prove useful in the future.
There are two EEPs that mention these: EEP12 and EEP25
EEP12 mentions let as part of a larger discussion, citing Haskell's let.
EEP25 mentions that with a generalized case a previous recommendation for a generalized cond would be unnecessary -- but I don't know what recommendation this refers to (maybe a previous draft of the EEP).
In any case, reserving a word is easy to do up front -- it is never a breaking change to reserve a word early, or to release it later once you're sure you don't need it. Reserving a word that was previously unreserved, though, is a recipe for disaster. It is arguably worse to suddenly find that you really need a term like let but can't use it because it's already important in programs people have -- so you wind up making up some nonstandard term that people can't remember and that is confusing to newcomers who are already familiar with the semantics of the basic paradigm.

Is there an idiomatic way to order function arguments in Erlang?

It seems inconsistent in the lists module. For example, split has the number as the first argument and the list as the second, but sublist has the list as the first argument and the length as the second.
OK, a little history as I remember it and some principles behind my style.
As Christian has said the libraries evolved and tended to get the argument order and feel from the impulses we were getting just then. So for example the reason why element/setelement have the argument order they do is because it matches the arg/3 predicate in Prolog; logical then but not now. Often we would have the thing being worked on first, but unfortunately not always. This is often a good choice as it allows "optional" arguments to be conveniently added to the end; for example string:substr/2/3. Functions with the thing as the last argument were often influenced by functional languages with currying, for example Haskell, where it is very easy to use currying and partial evaluation to build specific functions which can then be applied to the thing. This is very noticeable in the higher order functions in lists.
The only influence we didn't have was from the OO world. :-)
Usually we at least managed to be consistent within a module, but not always. See lists again. We did try to have some consistency, so the argument order in the higher order functions in dict/sets match those of the corresponding functions in lists.
The problem was also aggravated by the fact that we, especially me, had a rather cavalier attitude to libraries. I just did not see them as a selling point for the language, so I wasn't that worried about it. "If you want a library which does something then you just write it" was my motto. This meant that my libraries were structured, just not always with the same structure. :-) That was how many of the initial libraries came about.
This, of course, creates unnecessary confusion and breaks the law of least astonishment, but we have not been able to do anything about it. Any suggestions of revising the modules have always been met with a resounding "no".
My own personal style is usually structured, though I don't know if it conforms to any written guidelines or standards.
I generally have the thing or things I am working on as the first arguments, or at least very close to the beginning; the order depends on what feels best. If there is a global state which is chained through the whole module, which there usually is, it is placed as the last argument and given a very descriptive name like St0, St1, ... (I belong to the church of short variable names). Arguments which are chained through functions (both input and output) I try to keep the same argument order as return order. This makes it much easier to see the structure of the code. Apart from that I try to group together arguments which belong together. Also, where possible, I try to preserve the same argument order throughout a whole module.
None of this is very revolutionary, but I find if you keep a consistent style then it is one less thing to worry about and it makes your code feel better and generally be more readable. Also I will actually rewrite code if the argument order feels wrong.
A small example which may help:
fubar({f,A0,B0}, Arg2, Ch0, Arg4, St0) ->
    {A1,Ch1,St1} = foo(A0, Arg2, Ch0, St0),
    {B1,Ch2,St2} = bar(B0, Arg4, Ch1, St1),
    Res = baz(A1, B1),
    {Res,Ch2,St2}.
Here Ch is a local chained-through variable while St is a more global state. Check out the code on GitHub for LFE, especially the compiler, if you want a longer example.
This became much longer than it should have been, sorry.
P.S. I used the word thing instead of object to avoid confusion about what I was talking about.
No, there is no consistently-used idiom in the sense that you mean.
However, there are some useful relevant hints that apply especially when you're going to be making deeply recursive calls. For instance, keeping whichever arguments will remain unchanged during tail calls in the same order/position in the argument list allows the virtual machine to make some very nice optimizations.

When is an ambiguous grammar or production rule OK? (bison shift/reduce warnings)

There are certainly plenty of docs and how-tos on resolving shift/reduce conflicts. The bison docs suggest the correct solution is usually to just %expect them and deal with it.
When you have things like this:
S: S 'b' S | 't'
You can easily resolve them like this:
S: S 'b' T | T
T: 't'
My question is: Is it better to leave the grammar a touch ambiguous and %expect shift/reduce problems or is it better to try to adjust the grammar to avoid them? I suspect there is a balance and it's based on the needs of the author, but I don't really know.
As I read it, your question is "When is an ambiguous grammar or production rule OK?"
First consider the language you are describing. What would be the implication of allowing an ambiguous production rule into the language.
Your example describes a language which might include an expression like: t b t b t b t
The expression, resolved as in your second example, would be (((( t ) b t) b t ) b t ), but in an ambiguous grammar it could also become ( t b ( t b ( t b ( t)))) or even ( t b t ) b ( t b t ). Which of these is valid might depend on the language. If the b operator models subtraction, it really shouldn't be ambiguous, but if it were addition, it might be OK. This really depends on the language.
The second question to consider is what the resulting grammar source file ends up looking like, after the conflicts are resolved. As with other source code, a grammar is meant to be read by humans, and secondarily also by computers. Prefer a notation that gives a clearer explanation of what the parser is trying to do from the grammar. That is, if the parser is executing some possibly undefined behavior, for example, order of evaluation of a function's arguments in an eager language, make the grammar look ambiguous.
You can guide the conflict resolution with operator precedence. Declare 'b' as a left- or right-associative operator and you have covered at least that case.
For more complex patterns, as long as the final parser produces the correct result in all cases, the warnings aren't much to worry about. Though if you can't get it to give the correct result using declarations, you would have to rewrite the grammar.
In my compiler course last semester we used bison, and built a compiler for a subset of pascal.
If the language is complex enough, you will have some conflicts. As long as you understand why they are there, and what you'd have to do to remove them, we found it to be all right. If something was there but, due to the behaviour, would work as we wanted it to, and would require much too much thought and work to be worthwhile (while also complicating the grammar), we left it alone. Just make sure you fully understand the conflict, and document it somewhere (even just for yourself), so that you always know what's going on with it.
It's a cost/benefit analysis once things get really involved, but IMHO, fixing it should be considered FIRST, then actually figure out what the work would be (and if that work breaks something else, or makes something else harder), and go from there. Never pass them off as commonplace.
When I need to prove that a grammar is unambiguous, I tend to write it first as a Parsing Expression Grammar, and then convert it by hand to whatever grammar type the tool set I'm using for the project needs. In my experience, the need for this level of proof is very rare, though, since most shift/reduce conflicts I have come across have been fairly trivial ones to show the correctness of (on the order of your example).

Any reason I couldn't create a language supporting infix, postfix, and prefix functions, and more?

I've been mulling over creating a language that would be extremely well suited to creation of DSLs, by allowing definitions of functions that are infix, postfix, prefix, or even consist of multiple words. For example, you could define an infix multiplication operator as follows (where multiply(X,Y) is already defined):
a * b => multiply(a,b)
Or a postfix "squared" operator:
a squared => a * a
Or a C or Java-style ternary operator, which involves two keywords interspersed with variables:
a ? b : c => if a==true then b else c
Clearly there is plenty of scope for ambiguities in such a language, but if it is statically typed (with type inference), then most ambiguities could be eliminated, and those that remain could be considered a syntax error (to be corrected by adding brackets where appropriate).
Is there some reason I'm not seeing that would make this extremely difficult, impossible, or just a plain bad idea?
Edit: A number of people have pointed me to languages that may do this or something like this, but I'm actually interested in pointers to how I could implement my own parser for it, or problems I might encounter if doing so.
This is not too hard to do. You'll want to assign each operator a fixity (infix, prefix, or postfix) and a precedence. Make the precedence a real number; you'll thank me later. Operators of higher precedence bind more tightly than operators of lower precedence; at equal levels of precedence, you can require disambiguation with parentheses, but you'll probably prefer to permit some operators to be associative so you can write
x + y + z
without parentheses. Once you have a fixity, a precedence, and an associativity for each operator, you'll want to write an operator-precedence parser. This kind of parser is fairly simple to write; it scans tokens from left to right and uses one auxiliary stack. There is an explanation in the Dragon Book, but I have never found it very clear, in part because the Dragon Book describes a very general case of operator-precedence parsing. But I don't think you'll find it difficult.
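As a rough illustration (the token format, operator table, and AST shape here are all invented, and recursion stands in for the auxiliary stack), a precedence-climbing version for infix binary operators might look like this in Lua:
-- Operator table: name -> precedence and associativity (made up for illustration)
local ops = {
  ["+"] = { prec = 1, right = false },
  ["-"] = { prec = 1, right = false },
  ["*"] = { prec = 2, right = false },
  ["^"] = { prec = 3, right = true },  -- right-associative, like exponentiation
}

local function parse(tokens)
  local pos = 1
  local function peek() return tokens[pos] end
  local function take()
    local t = tokens[pos]
    pos = pos + 1
    return t
  end

  local parse_expr  -- forward declaration so parse_atom can recurse into it

  -- an atom is a number, an identifier, or a parenthesised sub-expression
  local function parse_atom()
    local t = take()
    if t == "(" then
      local e = parse_expr(0)
      assert(take() == ")", "expected ')'")
      return e
    end
    return tonumber(t) or t
  end

  -- parse_expr(min_prec) consumes operators whose precedence is >= min_prec
  parse_expr = function(min_prec)
    local lhs = parse_atom()
    while true do
      local op = ops[peek()]
      if not op or op.prec < min_prec then break end
      local name = take()
      -- left-associative: the right operand must bind strictly tighter;
      -- right-associative: the same precedence level is allowed again
      local rhs = parse_expr(op.right and op.prec or op.prec + 1)
      lhs = { name, lhs, rhs }  -- tiny table-based AST node
    end
    return lhs
  end

  return parse_expr(0)
end

-- parse({"1", "+", "2", "*", "3"}) builds { "+", 1, { "*", 2, 3 } }
Prefix operators would hook into parse_atom and postfix ones into the loop; either way, the per-operator fixity, precedence, and associativity data described above is all the parser needs.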
Another case you'll want to be careful of is when you have
prefix (e) postfix
where prefix and postfix have the same precedence. This case also requires parentheses for disambiguation.
My paper Unparsing Expressions with Prefix and Postfix Operators has an example parser in the back, and you can download the code, but it's written in ML, so its workings may not be obvious to the amateur. But the whole business of fixity and so on is explained in great detail.
What are you going to do about order of operations?
a * b squared
You might want to check out Scala, which has a kind of unique approach to operators and methods.
Haskell has just what you're looking for.

Are there well known algorithms for deducing the "return types" of parser rules?

Given a grammar and the attached action code, are there any standard solutions for deducing what type each production needs to result in (and consequently, what type the invoking production should expect to get from it)?
I'm thinking of an OO program and action code that employs something like C#'s var syntax (but I'm not looking for something that is C#-specific).
This would be fairly simple if it were not for function overloading and recursive grammars.
The issue arises with cases like this:
Foo ::=
    Bar Baz { return Fig(Bar, Baz); }
  | Foo Pit { return Pop(Foo, Pit); } // typeof(Foo) = fn(typeof(Foo))
If you are writing code in a functional language it is easy; standard Hindley-Milner type inference works great. Do not do this. In my EBNF parser generator (never released, but source code available on request), which supports Icon, C, and Standard ML, I actually implemented the idea you are asking about for the Standard ML back end: all the types were inferred. The resulting grammars were nearly impossible to debug.
If you throw overloading into the mix, the results are only going to be harder to debug. (That's right! This just in! Harder than impossible! Greater than infinity! Past my bedtime!) If you really want to try it yourself you're welcome to my code. (You don't; there's a reason I never released it.)
The return value of a grammar action is really no different from a local variable, so you should be able to use C# type inference to do the job. See this paper for some insight into how C# type inference is implemented.
The standard way of doing type inference is the Hindley-Milner algorithm, but that will not handle overloading out-of-the-box.
Note that even parser generators for type-inferencing languages don't typically infer the types of grammar actions. For example, ocamlyacc requires type annotations. The Happy parser generator for Haskell can infer types, but seems to discourage the practice. This might indicate that inferring types in grammars is difficult, a bad idea, or both.
[UPDATE] Very much pwned by Norman Ramsey, who has the benefit of bitter experience.
