Clarification regarding conversion to postfix - stack

So the expression A - B + C * (D / E) converts to A B - C D E / * +
I thought that when converting infix to postfix, you keep the operators on the stack until you see an operator with lower precedence or reach the end of the expression, and then you write them all down.
But the minus sign has the same precedence as the plus, so why is the minus sign written to the output instead of being kept on the stack?
I think there's something wrong with my understanding of the conversion method. Please provide an explanation, preferably with a step-by-step solution. I am really confused about what happens to the minus sign, as I understood the conversion differently. Thank you so much.

The Shunting yard algorithm rules state:
if the token is an operator, then:
    while ((there is a function at the top of the operator stack)
           or (there is an operator at the top of the operator stack with greater precedence)
           or (the operator at the top of the operator stack has equal precedence and is left associative))
          and (the operator at the top of the operator stack is not a left parenthesis):
        pop operators from the operator stack onto the output queue.
Note the condition after the second or: the operator at the top of the operator stack has equal precedence and is left associative.
The + and - operators have equal precedence, and are left-associative. Therefore, you would remove the - operator when you see the +.
See also https://www.geeksforgeeks.org/operator-precedence-and-associativity-in-c/. Although that's specific to C, most languages use the same precedence and associativity for common operators.
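For a step-by-step view, here is a minimal Python sketch of the rule above applied to the expression from the question. It assumes space-separated tokens and only the four basic left-associative operators; the names PRECEDENCE and infix_to_postfix are made up for this illustration, and functions are not handled.

```python
# Precedence table (an assumption for this sketch: only the four basic
# operators, all of them left-associative).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def infix_to_postfix(tokens):
    """Convert a list of infix tokens to a list of postfix tokens."""
    output = []
    stack = []  # operator stack; the top is the end of the list
    for token in tokens:
        if token in PRECEDENCE:
            # Pop operators of greater OR EQUAL precedence. Equal precedence
            # pops too because every operator here is left-associative.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[token]):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            # Pop back to the matching "(" and discard the parenthesis.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        else:
            output.append(token)  # operand goes straight to the output
        # Trace line: the token just read, then the stack and output so far.
        print(f"{token:>2}  stack={' '.join(stack):<8} output={' '.join(output)}")
    while stack:
        output.append(stack.pop())
    return output


result = infix_to_postfix("A - B + C * ( D / E )".split())
print(" ".join(result))  # A B - C D E / * +
```

In the trace, the line for + shows the - leaving the stack: it has equal precedence and is left-associative, so it is popped before + is pushed. The * that follows stays above + because it has higher precedence.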

Related

How is the conditional operator parsed?

So, the cppreference claims:
The expression in the middle of the conditional operator (between ? and :) is parsed as if parenthesized: its precedence relative to ?: is ignored.
However, it appears to me that the part of the expression after the ':' operator is also parsed as if it were between parentheses. I've tried to implement the ternary operator in my programming language (and you can see the results of parsing expressions here), and my parser pretends that the part of the expression after ':' is also parenthesized. For example, for the expression (1?1:0?2:0)-1, the interpreter for my programming language outputs 0, and this appears to be compatible with C. For instance, the C program:
#include <stdio.h>
int main() {
    printf("%d\n", (1?1:0?2:0)-1);
}
Outputs 0.
Had I written my parser so that, when parsing a ternary operator, it simply took the first already-parsed node after ':' as the third operand of '?:', the result would be the same as for ((1?1:0)?2:0)-1, that is, 1.
My question is: would this approach (pretending that the expression after the ':' is parenthesized) always be compatible with C?
"Pretends that it is parenthesised" is a kind of description of operator precedence. But of course that has to be interpreted relative to precedence relations (including associativity). So in a-b*c and a*b-c, the subtraction effectively acts as though its arguments are parenthesised; only the left-hand argument is treated that way in a-b-c; and it is the comparison operator which causes grouping in a<b-c and a-b<c.
I'm sure you know all that since your parser seems to work for all these cases, but I say that because the ternary operator is right-associative and of lower precedence than any other operator [Note 1]. That means that the pseudo-parentheses imposed by operator precedence surround the right-hand argument (regardless of its dominating operator, since all operators have higher precedence), and also the left-hand argument unless its dominating operator is another conditional operator. But that wouldn't be the case in C, where the comma operator has lower precedence and would not be enclosed by the imaginary parentheses following the :.
It's important to understand what is meant by the precedence of a complex operator. In effect, to compute the precedence relations we first collapse the operator to a simple ?: which includes the enclosed (second) argument. This is not "as if the expression were parenthesized", because it is parenthesized. It is parenthesized between ? and :, which in this context are syntactically parenthetic.
In this sense, it is very similar to the usual analysis of the subscript operator as a postfix operator, although the brackets of the subscript operator enclose a second argument. The precedence of the subscript operator is logically what would result from considering it to be a single [], abstracting away the expression contained inside. This is also the same as the function call operator. That happens to be written with parentheses, but the precise symbols are not important: it is possible to imagine an alternative language in which function calls are written with different symbols, perhaps { and }. That wouldn't affect the grammar at all.
It might seem odd to think of ? and : to be "parenthetic", since they don't look parenthetic. But a parser doesn't see the shapes of the symbols. It is satisfied by being told that a ( is closed by a ) and, in this case, that a ? is closed by a :. [Note 2]
Having said all that, I tried your compiler on the conditional expression
d = 0 ? 0 : n / d
It parses this expression correctly, but the compiled code computes n / d before verifying whether d = 0 is true. That's not the way the conditional operator should work; in this case, it will lead to an unexpected divide by 0 exception. The conditional operator must first evaluate its left-hand argument, and then evaluate exactly one of the other two expressions.
Notes:
1. In C, this is not quite correct. The comma operator has lower precedence, and there is a more complex interaction with assignment operators, which logically have the same precedence and are also right-associative.
2. In C-like languages those symbols are not used for any other purpose, so it's OK to just regard them as strange-looking parentheses and leave it at that. But as the case of the function-call operator shows (or, for that matter, the unary - operator), it is sometimes possible to reuse operator symbols for more than one purpose.
As a curiosity, it is not strictly necessary that open and close parentheses be different symbols, as long as they are not used for any other purpose. So, for example, if | is not used as an operator symbol (as it is in C), then you could use | a | to mean the absolute value of a without creating any ambiguities.
A precise analysis of the circumstances in which symbol reuse leads to actual ambiguities is beyond the scope of this answer.
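To make both points concrete (the right-associative grouping after the : and the evaluation order), here is a minimal Python sketch of a parser and evaluator for a tiny expression language. It is only an illustration under stated assumptions: integers, + - * /, parentheses and ?: are the only constructs, there are no comma or assignment operators, and the names Parser, evaluate and run are hypothetical. Division uses Python's floor division, which differs from C for negative operands.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


class Parser:
    def __init__(self, text):
        self.tokens = re.findall(r"\d+|[-+*/?:()]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def conditional(self):
        # conditional ::= additive [ "?" conditional ":" conditional ]
        cond = self.additive()
        if self.peek() == "?":
            self.take("?")
            middle = self.conditional()  # bracketed by "?" and ":"
            self.take(":")
            last = self.conditional()    # recursing here gives right-associativity
            return ("?:", cond, middle, last)
        return cond

    def binary(self, ops, operand):
        # Left-associative chain: operand (op operand)*
        left = operand()
        while self.peek() in ops:
            op = self.take()
            left = (op, left, operand())
        return left

    def additive(self):
        return self.binary(("+", "-"), self.multiplicative)

    def multiplicative(self):
        return self.binary(("*", "/"), self.primary)

    def primary(self):
        if self.peek() == "(":
            self.take("(")
            node = self.conditional()
            self.take(")")
            return node
        return int(self.take())


def evaluate(node):
    if isinstance(node, int):
        return node
    if node[0] == "?:":
        _, cond, middle, last = node
        # Evaluate the condition first, then exactly one of the branches.
        return evaluate(middle) if evaluate(cond) else evaluate(last)
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


def run(text):
    return evaluate(Parser(text).conditional())


print(run("(1?1:0?2:0)-1"))    # 0
print(run("((1?1:0)?2:0)-1"))  # 1
print(run("1 ? 7 : 1/0"))      # 7, the division is never evaluated
```

Because evaluate only descends into the selected branch, the division by zero in the last line is never executed, which is the behaviour the conditional operator is supposed to have.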

Convert Arithmetic Expression Tree to string without unnecessary parenthesis?

Assume I build an abstract syntax tree of simple arithmetic operators, like Div(left,right), Add(left,right), Prod(left,right), Sum(left,right), Sub(left,right).
However, when I want to convert the AST to a string, I find it hard to remove the unnecessary parentheses.
Note that the output string should follow the normal mathematical operator precedence.
Examples:
Prod(Prod(1,2),Prod(2,3)); let's denote this as ((1*2)*(2*3))
Converted to a string, it should be 1*2*2*3
more examples:
(((2*3)*(3/5))-4) ==> 2*3*3/5 - 4
(((2-3)*((3*7)/(1*5)))-4) ==> (2-3)*3*7/(1*5) - 4
(1/(2/3))/5 ==> 1/(2/3)/5
((1/2)/3)/5 ==> 1/2/3/5
((1-2)-3)-(4-6)+(1-3) ==> 1-2-3-(4-6)+1-3
I found the answer in this question.
Although that question is a little different from mine, the algorithm still applies.
The rule is: if a child of a node has lower precedence than the node, the child needs a pair of parentheses.
If the operator of a node is one of -, /, %, and the right operand has the same precedence as the node, the right operand also needs parentheses.
I give the pseudo-code (Scala-like code) below:
def toString(e: Expression, parentPrecedence: Int = -1): String = {
  e match {
    case Sub(left2, right2) =>
      val p = 10
      val left = toString(left2, p)
      // p + 1 forces parentheses around a right operand of equal precedence,
      // because subtraction is not associative: a-(b-c) is not a-b-c
      val right = toString(right2, p + 1)
      val op = "-"
      lazy val s2 = left :: right :: Nil mkString op
      if (parentPrecedence > p)
        s"($s2)"
      else s"$s2"
    // the cases for Modulus and Div are similar to Sub except for p
    case Sum(left2, right2) =>
      val p = 10
      val left = toString(left2, p)
      // no +1 here: addition is treated as associative
      val right = toString(right2, p)
      val op = "+"
      lazy val s2 = left :: right :: Nil mkString op
      if (parentPrecedence > p)
        s"($s2)"
      else s"$s2"
    // case Prod is similar to Sum
    ....
  }
}
For a simple expression grammar, you can eliminate (most) redundant parentheses using operator precedence, essentially the same way that you parse the expression into an AST.
If you're looking at a node in an AST, all you need to do is to compare the precedence of the node's operator with the precedence of the operator for the argument, using the operator's associativity in the case that the precedences are equal. If the argument's operator has higher precedence than the node's operator, the argument does not need to be surrounded by parentheses; if it has lower precedence, it needs them. (The two arguments need to be examined independently.) If an argument is a literal or identifier, then of course no parentheses are necessary; this special case can easily be handled by making the precedence of such values infinite (or at least larger than any operator precedence).
However, your example includes another proposal for eliminating redundant parentheses, based on the mathematical associativity of the operator. Unfortunately, mathematical associativity is not always applicable in a computer program. If your expressions involve floating point numbers, for example, a+(b+c) and (a+b)+c might have very different values:
(gdb) p (1000000000000000000000.0 + -1000000000000000000000.0) + 2
$1 = 2
(gdb) p 1000000000000000000000.0 + (-1000000000000000000000.0 + 2)
$2 = 0
For this reason, it's pretty common for compilers to avoid rearranging the order of application of multiplication and addition, at least for floating point arithmetic, and also for integer arithmetic in the case of languages which check for integer overflow.
But if you do really want to rearrange based on mathematical associativity, you'll need an additional check during the walk of the AST; before checking precedence, you'll want to check whether the node you're visiting and its left argument use the same operator, where that operator is known to be mathematically associative. (This assumes that only operators which group to the left are mathematically associative. In the unlikely case that you have a mathematically associative operator which groups to the right, you'll want to check the visited node and its right-hand argument.)
If that condition is met, you can rotate the root of the AST, turning (for example) PROD(PROD(a,b),□) into PROD(a,PROD(b,□)). That might lead to additional rotations in the case that a is also a PROD node.
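As a rough illustration of the precedence-based check (without any rearrangement based on mathematical associativity), here is a minimal Python sketch. The BinOp class, the PRECEDENCE table and to_string are hypothetical names for this example, and it assumes that every operator groups to the left.

```python
from dataclasses import dataclass
from typing import Union

# Assumed precedence table; all four operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[int, BinOp]


def precedence(e):
    # Literals get infinite precedence, so they never need parentheses.
    return PRECEDENCE[e.op] if isinstance(e, BinOp) else float("inf")


def to_string(e):
    if not isinstance(e, BinOp):
        return str(e)
    p = PRECEDENCE[e.op]
    left, right = to_string(e.left), to_string(e.right)
    # Left argument: parentheses only if it binds more loosely than this node.
    if precedence(e.left) < p:
        left = f"({left})"
    # Right argument: the operators group to the left, so an argument of
    # equal precedence must keep its parentheses as well.
    if precedence(e.right) <= p:
        right = f"({right})"
    return f"{left}{e.op}{right}"


# (1/(2/3))/5 keeps only the parentheses that change the grouping.
print(to_string(BinOp("/", BinOp("/", 1, BinOp("/", 2, 3)), 5)))  # 1/(2/3)/5
# ((1/2)/3)/5 needs none.
print(to_string(BinOp("/", BinOp("/", BinOp("/", 1, 2), 3), 5)))  # 1/2/3/5
# (1*2)*(2*3) keeps the right-hand pair, since no rearrangement is done.
print(to_string(BinOp("*", BinOp("*", 1, 2), BinOp("*", 2, 3))))  # 1*2*(2*3)
```

This version preserves the order of evaluation exactly, which is why 1*2*(2*3) keeps its parentheses. Printing it as 1*2*2*3 would require the additional associativity check described above.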

Left recursion parsing

Description:
While reading the book Compiler Design in C, I came across the following description of a context-free grammar:
a grammar that recognizes a list of one or more statements, each of
which is an arithmetic expression followed by a semicolon. Statements are made up of a
series of semicolon-delimited expressions, each comprising a series of numbers
separated either by asterisks (for multiplication) or plus signs (for addition).
And here is the grammar:
1. statements ::= expression;
2. | expression; statements
3. expression ::= expression + term
4. | term
5. term ::= term * factor
6. | factor
7. factor ::= number
8. | (expression)
The book states that this recursive grammar has a major problem: in several productions, the left-hand side also appears as the leftmost symbol of the right-hand side, as in Production 3 (this property is called left recursion), and certain parsers, such as recursive-descent parsers, can't handle left-recursive productions. They just loop forever.
You can understand the problem by considering how the parser decides to apply a particular production when it is replacing a non-terminal that has more than one right hand side. The simple case is evident in Productions 7 and 8. The parser can choose which production to apply when it's expanding a factor by looking at the next input symbol. If this symbol is a number, then the compiler applies Production 7 and replaces the factor with a number. If the next input symbol was an open parenthesis, the parser would use Production 8. The choice between Productions 5 and 6 cannot be solved in this way, however. In the case of Production 6, the right-hand side of term starts with a factor which, in turn, starts with either a number or left parenthesis. Consequently, the parser would like to apply Production 6 when a term is being replaced and the next input symbol is a number or left parenthesis. Production 5 (the other right-hand side) starts with a term, which can start with a factor, which can start with a number or left parenthesis, and these are the same symbols that were used to choose Production 6.
Question:
That second quote from the book got me completely lost. So, using an example statement such as 5 + (7*4) + 14:
What's the difference between factor and term, using the same example?
Why can't a recursive-descent parser handle left-recursion productions? (Explain second quote).
What's the difference between factor and term, using the same example?
I am not using the same example, as it wouldn't give you a clear picture of the point you are unsure about.
Given,
term ::= term * factor | factor
factor ::= number | (expression)
Now suppose I ask you to find the factors and terms in the expression 2*3*4.
Multiplication is left-associative, so the expression is evaluated as:
(2*3)*4
As you can see, here 2*3 is the term and 4 (a number) is the factor. You can extend this approach to any depth to get the idea of a term.
According to the given grammar, a multiplication chain splits into a term followed by a single factor; that term in turn splits into another term followed by another single factor, and so on. This is how expressions are evaluated.
Why can't a recursive-descent parser handle left-recursion productions? (Explain second quote).
Your second quote is quite clear in its essence. A recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure usually implements one of the productions of the grammar.
The book says so because a recursive descent parser will go into an infinite loop if a non-terminal keeps expanding into itself.
The same holds for a recursive descent parser even with backtracking: when we try to expand a non-terminal, we may eventually find ourselves trying to expand the same non-terminal again without having consumed any input.
A -> Ab
Here, the leading non-terminal A can keep being expanded:
A -> Ab -> Abb -> Abbb -> ... (an infinite loop on A)
Hence, we avoid left-recursive productions when working with recursive-descent parsers.
The rule term matches the string "1*3"; the rule factor does not (though it would match "(1*3)"). In essence, each rule represents one level of precedence: expression contains the operator with the lowest precedence, term the second lowest, and factor the highest level (numbers and parenthesized expressions). If you're in factor and you want to use an operator with lower precedence, you need to add parentheses.
If you implement a recursive descent parser using recursive functions, a rule like a ::= b "*" c | d might be implemented like this:
// Takes the entire input string and the index at which we currently are
// Returns the index after the rule was matched or throws an exception
// if the rule failed
parse_a(input, index) {
    try {
        after_b = parse_b(input, index)
        after_star = parse_string("*", input, after_b)
        after_c = parse_c(input, after_star)
        return after_c
    } catch(ParseFailure) {
        // If one of the rules b, "*" or c did not match, try d instead
        return parse_d(input, index)
    }
}
Something like this would work fine (in practice you might not actually want to use recursive functions, but the approach you'd use instead would still behave similarly). Now, let's consider the left-recursive rule a ::= a "*" b | c instead:
parse_a(input, index) {
    try {
        // Left recursion: parse_a calls itself at the same index
        after_a = parse_a(input, index)
        after_star = parse_string("*", input, after_a)
        after_b = parse_b(input, after_star)
        return after_b
    } catch(ParseFailure) {
        // If one of the rules a, "*" or b did not match, try c instead
        return parse_c(input, index)
    }
}
Now the first thing that the function parse_a does is to call itself again at the same index. This recursive call will again call itself. And this will continue ad infinitum, or rather until the stack overflows and the whole program comes crashing down. If we use a more efficient approach instead of recursive functions, we'll actually get an infinite loop rather than a stack overflow. Either way we don't get the result we want.
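One common way around this (not taken from the book quote above, so treat it as an assumption about how you might proceed) is to rewrite the left-recursive rules as repetition: expression ::= term ("+" term)* and term ::= factor ("*" factor)*. A minimal Python sketch of a recursive descent evaluator for that rewritten grammar follows; the function name evaluate is made up for this example, and the trailing semicolon of a statement is left out for brevity.

```python
import re


def evaluate(text):
    """Evaluate an expression made of numbers, +, * and parentheses."""
    tokens = re.findall(r"\d+|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expression():
        # expression ::= term ("+" term)*
        # The loop replaces the left-recursive "expression + term".
        value = term()
        while peek() == "+":
            advance()
            value += term()
        return value

    def term():
        # term ::= factor ("*" factor)*
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():
        # factor ::= number | "(" expression ")"
        # One token of lookahead is enough to choose the production.
        if peek() == "(":
            advance()
            value = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if peek() is None or not peek().isdigit():
            raise SyntaxError(f"expected a number, got {peek()!r}")
        return int(advance())

    result = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate("5 + (7*4) + 14"))  # 47
```

Every function consumes at least one token before it can reach itself again, so the recursion always terminates. Accumulating the value inside the loops also keeps the operators left-associative, just as the original left-recursive grammar intended.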

Divided operation in Swift

Why do I constantly get an error on this?
var rotation:Float= Double(arc4random_uniform(50))/ Double(100-0.2)
Actually, I tried this one too:
var rotation:Double= Double(arc4random_uniform(50))/ Double(100-0.2)
Thank you
Swift has strict rules about the whitespace around operators. Divide '/' is a binary operator.
The important rules are:
If an operator has whitespace around both sides or around neither side, it is treated as a binary operator. As an example, the + operator in a+b and a + b is treated as a binary operator.
If an operator has whitespace on the left side only, it is treated as a prefix unary operator. As an example, the ++ operator in a ++b is treated as a prefix unary operator.
If an operator has whitespace on the right side only, it is treated as a postfix unary operator. As an example, the ++ operator in a++ b is treated as a postfix unary operator.
That means that you need to add a space before the / or remove the space after it to indicate that it is a binary operator:
var rotation = Double(arc4random_uniform(50)) / (100.0 - 0.2)
If you want rotation to be a Float, you should use that instead of Double:
var rotation = Float(arc4random_uniform(50)) / (100.0 - 0.2)
There is no need to specify the type explicitly since it will be inferred from the value you are assigning to. Also, you do not need to explicitly construct your literals as a specific type as those will conform to the type you are using them with.

Pseudo Code for converting infix to postfix

I'm struggling to write the pseudo code for this.
Scan string left to right for each char
    If operand add it to string
    Else if operator add to stack
    ....
I'm struggling with how to handle ( )s
Have you tried these links yet?
http://www.geocities.com/e_i_search/premshree/web-include/pub/infix-postfix/index.htm
http://code.activestate.com/recipes/228915-infixpostfix/
( goes on to the stack, then when you get to ) you pop from the stack until you find a (.
Wikipedia has a more detailed description of the algorithm, supporting functions as well as operators.
Scan the input string from left to right, character by character.
If the character is an operand, append it to the output.
If the character is an operator and the operator stack is empty, push the operator onto the operator stack.
If the character is an operator and the operator stack is not empty, there are two possibilities:
If the precedence of the scanned operator is greater than that of the topmost operator on the operator stack, push the scanned operator onto the operator stack.
If the precedence of the scanned operator is less than or equal to that of the topmost operator on the operator stack, pop operators from the operator stack to the output until the topmost operator has lower precedence than the scanned operator, then push the scanned operator. Never pop out a bracket ( '(' ) this way, whatever the precedence level of the scanned operator.
If the character is an opening round bracket ( '(' ), push it onto the operator stack.
If the character is a closing round bracket ( ')' ), pop operators from the operator stack to the output until an opening bracket ( '(' ) is found, then discard both brackets.
When the input is exhausted, pop all the remaining operators from the operator stack and append them to the output.
I am a bit rusty at this, but when you encounter a '(', you push it onto the stack because it has the highest precedence. I can't remember what to do when you encounter ')', but I think it goes on the stack as well because it's the highest precedence.

Resources