Evaluate already parsed duration expression - parsing

I'm writing a tool which reads Go code and parses duration expressions like dur := 5 * time.Minute. I already have the parsing step done and got a *ast.BinaryExpr. How can I evaluate this expression and get its value?
Is there something in the toolchain/packages or do I need to go by hand?

I think the parser package is the one you're looking for.
There is also a third-party package named eval (https://godoc.org/github.com/apaxa-go/eval) that evaluates expressions.
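If you want to stay inside the toolchain packages, a small hand-written walk over the AST is enough for constant expressions like this one. Here's a minimal sketch using go/parser and go/constant; the hard-coded table of time constants is my own assumption about which identifiers can appear, not part of any standard API:

package main

import (
	"fmt"
	"go/ast"
	"go/constant"
	"go/parser"
	"go/token"
	"time"
)

// identifiers we know how to resolve; anything else is an error.
var idents = map[string]int64{
	"time.Second": int64(time.Second),
	"time.Minute": int64(time.Minute),
	"time.Hour":   int64(time.Hour),
}

func eval(e ast.Expr) (int64, error) {
	switch e := e.(type) {
	case *ast.BasicLit: // e.g. 5
		v := constant.MakeFromLiteral(e.Value, e.Kind, 0)
		if n, ok := constant.Int64Val(v); ok {
			return n, nil
		}
		return 0, fmt.Errorf("not an integer literal: %s", e.Value)
	case *ast.SelectorExpr: // e.g. time.Minute
		if pkg, ok := e.X.(*ast.Ident); ok {
			if v, ok := idents[pkg.Name+"."+e.Sel.Name]; ok {
				return v, nil
			}
		}
		return 0, fmt.Errorf("unknown selector")
	case *ast.BinaryExpr: // e.g. 5 * time.Minute
		x, err := eval(e.X)
		if err != nil {
			return 0, err
		}
		y, err := eval(e.Y)
		if err != nil {
			return 0, err
		}
		switch e.Op {
		case token.MUL:
			return x * y, nil
		case token.ADD:
			return x + y, nil
		}
		return 0, fmt.Errorf("unsupported operator %s", e.Op)
	}
	return 0, fmt.Errorf("unsupported expression %T", e)
}

func main() {
	expr, _ := parser.ParseExpr("5 * time.Minute") // yields a *ast.BinaryExpr
	n, err := eval(expr)
	if err != nil {
		panic(err)
	}
	fmt.Println(time.Duration(n)) // 5m0s
}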
And there are two libraries that might help you down the line.
https://github.com/PaesslerAG/gval
https://github.com/Knetic/govaluate
Gval (Go eVALuate) provides support for evaluating arbitrary expressions, in particular Go-like expressions.
Good luck!

Related

iOS DDMathParser get solving steps of math expression

I am building some kind of calculator for iOS using DDMathParser. I would like to know how to separate an expression (NSString) like 3+3*4 (after parsing, add(3,multiply(3,4))) into its individual steps: multiply(3,4), then add(3,12). I would prefer to get these in an NSArray or other relatively useful list object.
Example: add(3,multiply(3,4))
[1] multiply(3,4)
[2] add(3,12)
[3] 15
DDMathParser author here.
This should be pretty straightforward. The parser can give you back the well-formed expression tree instead of just the numeric evaluation. With the expression tree, you'd do something like this (a rough sketch follows the list):
Find the deepest expression in the tree. This is the first thing that will be evaluated.
Evaluate that expression manually
Substitute the evaluation back in to the tree in place of the original expression
Create a copy of the entire tree to refer to what it looked like at this point.
Repeat until there's nothing more to evaluate.
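I can't speak to DDMathParser's exact API here, but the loop itself is language-agnostic. A rough sketch in Go of the reduce-and-snapshot idea (all names invented for illustration):

package main

import "fmt"

// Node is either a plain number (Op == "") or an operation over two children.
type Node struct {
	Op          string
	Value       float64
	Left, Right *Node
}

func num(v float64) *Node           { return &Node{Value: v} }
func op(o string, l, r *Node) *Node { return &Node{Op: o, Left: l, Right: r} }

func opName(o string) string {
	if o == "+" {
		return "add"
	}
	return "multiply"
}

func (n *Node) String() string {
	if n.Op == "" {
		return fmt.Sprintf("%g", n.Value)
	}
	return fmt.Sprintf("%s(%s,%s)", opName(n.Op), n.Left, n.Right)
}

// reduceOne evaluates the deepest fully-numeric operation in place and
// reports whether there was anything left to evaluate.
func reduceOne(n *Node) bool {
	if n.Op == "" {
		return false
	}
	if reduceOne(n.Left) || reduceOne(n.Right) {
		return true
	}
	// both children are plain numbers: substitute the result into this node
	if n.Op == "+" {
		n.Value = n.Left.Value + n.Right.Value
	} else {
		n.Value = n.Left.Value * n.Right.Value
	}
	n.Op, n.Left, n.Right = "", nil, nil
	return true
}

func main() {
	tree := op("+", num(3), op("*", num(3), num(4))) // add(3,multiply(3,4))
	steps := []string{tree.String()}
	for reduceOne(tree) {
		steps = append(steps, tree.String()) // snapshot after each step
	}
	fmt.Println(steps) // [add(3,multiply(3,4)) add(3,12) 15]
}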

Is possible to use F# pattern matching as a solver/library for another language or DSL?

I'm building a toy language, and I want to have pattern matching. I could build the whole thing myself (though I don't know how), but since I will write it in F#, I wonder if I can defer the whole thing to F#.
So, I have an interpreter and my custom syntax. If I have an AST, is it possible to use F# to solve the pattern matching?
Another way to look at this: is it possible to use F# pattern matching from C# and other .NET languages?
For example, if the program is (in invented syntax almost like F#):
case x with
( 1 , 2 , Three):
print("Found 1, 2, or 3!")
else var1:
print("%d" % var1)
Is it possible to do
matched, error = F#MagicHere.Match(EvaluateMyAST)
I'm not exactly sure if I understand your question correctly, but I suspect you could use F# Compiler Service to do what you need. Basically, the compiler service lets you call some of the tasks that the compiler performs (and it is a normal .NET library).
The first step would be to turn your AST into valid F# code - I guess you could find a systematic way of doing this (but it requires some thinking). Then you can use F.C.S to:
Type-check the expression, which gives you warnings for overlapping cases and missing cases (e.g. when you have an incomplete pattern match).
It gives you the AST of the pattern matching and I believe you can even get a decision tree that you can interpret to evaluate the pattern matching.
Alternatively, you could use F.C.S to compile the code and run it (provided that you can translate your DSL to F# code)

use variables in Lemon parser?

I want to allow mathematical variables in my Lemon parser-driven app. For example, if the user enters x^2+y, I want to then be able to evaluate this for 100000 different pairs of values of x and y, hopefully without having to reparse each time. The only way I can think of to do that is to have the parser generate a tree of objects, which then evaluates the expression when given the input. Is there a better/simpler/faster way?
Performance may be an issue here. But I also care about ease of coding and code upkeep.
If you want the most maintainable code, evaluate the expression as you parse. Don't build a tree.
If you want to re-execute the expression a lot, and the expression is complicated, you'll need to avoid reparsing (in order of most to least maintainable): build tree and evaluate, generate threaded code and evaluate, generate native code and evaluate.
If the expressions are generally as simple as your example, a recursive descent hand-coded parser that evaluates on the fly will likely be very fast, and work pretty well, even for 100,000 iterations. Such parsers will likely take much less time to execute than Lemon.
That is indeed how you would typically do it, unless you want to generate actual (real or virtual) code. x and y would just be variables in your case so you would fill in the actual values and then call your Evaluate function to evaluate the expression. The tree node would then contain pointers to the variables x and y, and so on. No need to parse it for each pair of test values.
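To make that concrete, here's a small sketch (in Go rather than the C that Lemon generates, with invented names) of a tree whose leaves are variable references, evaluated repeatedly against different bindings without reparsing:

package main

import (
	"fmt"
	"math"
)

// Expr is any node the parser built once.
type Expr interface {
	Eval(env map[string]float64) float64
}

type Num float64

func (n Num) Eval(map[string]float64) float64 { return float64(n) }

type Var string

func (v Var) Eval(env map[string]float64) float64 { return env[string(v)] }

type BinOp struct {
	Op   byte // '+', '*', '^'
	L, R Expr
}

func (b BinOp) Eval(env map[string]float64) float64 {
	l, r := b.L.Eval(env), b.R.Eval(env)
	switch b.Op {
	case '+':
		return l + r
	case '*':
		return l * r
	case '^':
		return math.Pow(l, r)
	}
	panic("unknown operator")
}

func main() {
	// the tree for x^2 + y, built by the parser exactly once
	expr := BinOp{'+', BinOp{'^', Var("x"), Num(2)}, Var("y")}

	// evaluate it for many (x, y) pairs without reparsing
	for _, p := range [][2]float64{{1, 2}, {3, 4}, {10, 5}} {
		fmt.Println(expr.Eval(map[string]float64{"x": p[0], "y": p[1]})) // 3, 13, 105
	}
}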

Building a parser (Part I)

I'm making my own JavaScript-based programming language (yeah, it is crazy, but it's for learning only... maybe?). Well, I'm reading about parsers, and the first pass is to convert the source code to tokens, like:
if(x > 5)
return true;
The tokenizer turns that into:
T_IF "if"
T_LPAREN "("
T_IDENTIFIER "x"
T_GT ">"
T_NUMBER "5"
T_RPAREN ")"
T_IDENTIFIER "return"
T_TRUE "true"
T_TERMINATOR ";"
I don't know if my logic is correct so far. In my parser it gets even better (or not?) and translates to this (yeah, a multidimensional array):
T_IF "if"
T_EXPRESSION ...
T_IDENTIFIER "x"
T_GT ">"
T_NUMBER "5"
T_CLOSURE ...
T_IDENTIFIER "return"
T_TRUE "true"
I have some doubts:
Is my way better or worse than the original way? Note that my code will be read and compiled (translated to another language, like PHP), instead of interpreted all the time.
After I tokenize, what do I need to do exactly? I'm really lost at this step!
Are there some good tutorials to learn how I can do it?
Well, that's it. Bye!
Generally, you want to separate the functions of the tokeniser (also called a lexer) from other stages of your compiler or interpreter. The reason for this is basic modularity: each pass consumes one kind of thing (e.g., characters) and produces another one (e.g., tokens).
So you’ve converted your characters to tokens. Now you want to convert your flat list of tokens to meaningful nested expressions, and this is what is conventionally called parsing. For a JavaScript-like language, you should look into recursive descent parsing. For parsing expressions with infix operators of different precedence levels, Pratt parsing is very useful, and you can fall back on ordinary recursive descent parsing for special cases.
Just to give you a more concrete example based on your case, I’ll assume you can write two functions: accept(token) and expect(token), which test the next token in the stream you’ve created. You’ll make a function for each type of statement or expression in the grammar of your language. Here’s Pythonish pseudocode for a statement() function, for instance:
def statement():
    if accept("if"):
        x = expression()
        y = statement()
        return IfStatement(x, y)
    elif accept("return"):
        x = expression()
        return ReturnStatement(x)
    elif accept("{"):
        xs = []
        while True:
            xs.append(statement())
            if not accept(";"):
                break
        expect("}")
        return Block(xs)
    else:
        error("Invalid statement!")
This gives you what’s called an abstract syntax tree (AST) of your program, which you can then manipulate (optimisation and analysis), output (compilation), or run (interpretation).
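The expression() calls above are where operator precedence comes in. Here's a compact sketch of precedence climbing (a simple form of Pratt parsing), written in Go and assuming the tokenizer has already produced a flat slice of tokens; all names here are illustrative:

package main

import (
	"fmt"
	"strconv"
	"strings"
)

// node is a number leaf (op == "") or a binary operation.
type node struct {
	op          string
	value       float64
	left, right *node
}

func (n *node) String() string {
	if n.op == "" {
		return strconv.FormatFloat(n.value, 'g', -1, 64)
	}
	return "(" + n.left.String() + " " + n.op + " " + n.right.String() + ")"
}

var precedence = map[string]int{"+": 1, "-": 1, "*": 2, "/": 2}

type parser struct {
	tokens []string
	pos    int
}

func (p *parser) peek() string {
	if p.pos < len(p.tokens) {
		return p.tokens[p.pos]
	}
	return ""
}

func (p *parser) next() string { t := p.peek(); p.pos++; return t }

// primary parses a number or a parenthesized sub-expression.
func (p *parser) primary() *node {
	tok := p.next()
	if tok == "(" {
		e := p.expression(0)
		p.next() // consume ")"
		return e
	}
	v, _ := strconv.ParseFloat(tok, 64)
	return &node{value: v}
}

// expression keeps absorbing operators whose precedence is at least minPrec.
func (p *parser) expression(minPrec int) *node {
	left := p.primary()
	for {
		op := p.peek()
		prec, ok := precedence[op]
		if !ok || prec < minPrec {
			return left
		}
		p.next()
		right := p.expression(prec + 1) // left-associative operators
		left = &node{op: op, left: left, right: right}
	}
}

func main() {
	toks := strings.Fields("1 + 2 * ( 3 - 1 )")
	fmt.Println((&parser{tokens: toks}).expression(0)) // (1 + (2 * (3 - 1)))
}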
Most toolkits split the complete process into two separate parts:
lexer (aka. tokenizer)
parser (aka. grammar)
The tokenizer will split the input data into tokens. The parser will only operate on the token "stream" and build the structure.
Your question seems to be focused on the tokenizer. But your second solution mixes the grammar parser and the tokenizer into one step. Theoretically this is also possible, but for a beginner it is much easier to do it the same way most other tools/frameworks do: keep the steps separate.
To your first solution: I would tokenize your example like this:
T_KEYWORD_IF "if"
T_LPAREN "("
T_IDENTIFIER "x"
T_GT ">"
T_LITERAL "5"
T_RPAREN ")"
T_KEYWORD_RET "return"
T_KEYWORD_TRUE "true"
T_TERMINATOR ";"
In most languages keywords cannot be used as method names, variable names and so on. This is reflected already on the tokenizer level (T_KEYWORD_IF, T_KEYWORD_RET, T_KEYWORD_TRUE).
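For illustration, a minimal keyword-aware tokenizer in Go for the snippet above; the token names follow the list above, and everything else (helper names, the single-character table) is invented:

package main

import (
	"fmt"
	"unicode"
)

type token struct{ kind, text string }

// keywords get their own token kinds, distinct from identifiers.
var keywords = map[string]string{"if": "T_KEYWORD_IF", "return": "T_KEYWORD_RET", "true": "T_KEYWORD_TRUE"}

func tokenize(src string) []token {
	var toks []token
	rs := []rune(src)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case unicode.IsLetter(r):
			j := i
			for j < len(rs) && unicode.IsLetter(rs[j]) {
				j++
			}
			word := string(rs[i:j])
			if kind, ok := keywords[word]; ok {
				toks = append(toks, token{kind, word})
			} else {
				toks = append(toks, token{"T_IDENTIFIER", word})
			}
			i = j
		case unicode.IsDigit(r):
			j := i
			for j < len(rs) && unicode.IsDigit(rs[j]) {
				j++
			}
			toks = append(toks, token{"T_LITERAL", string(rs[i:j])})
			i = j
		default:
			kinds := map[rune]string{'(': "T_LPAREN", ')': "T_RPAREN", '>': "T_GT", ';': "T_TERMINATOR"}
			toks = append(toks, token{kinds[r], string(r)})
			i++
		}
	}
	return toks
}

func main() {
	for _, t := range tokenize("if(x > 5) return true;") {
		fmt.Println(t.kind, t.text) // matches the token list above
	}
}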
The next level would take this stream and - by applying a formal grammar - would build some datastructure (often called AST - Abstract Syntax Tree) which might look like this:
IfStatement:
    Expression:
        BinaryOperator:
            Operator: T_GT
            LeftOperand:
                IdentifierExpression:
                    "x"
            RightOperand:
                LiteralExpression
                    5
    IfBlock
        ReturnStatement
            ReturnExpression
                LiteralExpression
                    "true"
    ElseBlock (empty)
Implementing the parser is usually done with some framework. Implementing something like that by hand, and efficiently, usually takes the better part of a semester at university. So you really should use some kind of framework.
The input for a grammar parser framework is usually a formal grammar in some kind of BNF. Your "if" part might look like this:
IfStatement: T_KEYWORD_IF T_LPAREN Expression T_RPAREN Statement ;
Expression: LiteralExpression | BinaryExpression | IdentifierExpression | ... ;
BinaryExpression: LeftOperand BinaryOperator RightOperand;
....
That's only to give you the idea. Parsing a real-world language like JavaScript correctly is not an easy task. But fun.
Is my way better or worse than the original way? Note that my code will be read and compiled (translated to another language, like PHP), instead of interpreted all the time.
What's the original way? There are many different ways to implement languages. I think yours is actually fine; I once tried to build a language myself that translated to C#, the hack programming language. Many language compilers translate to an intermediate language; it's quite common.
After I tokenize, what do I need to do exactly? I'm really lost at this step!
After tokenizing, you need to parse it. Use some good lexer/parser framework, such as Boost.Spirit or Coco, or whatever. There are hundreds of them. Or you can implement your own lexer, but that takes time and resources. There are many ways to parse code; I generally rely on recursive descent parsing.
Next you need to do Code Generation. That's the most difficult part in my opinion. There are tools for that too, but you can do it manually if you want to. I tried to do it in my project, but it was pretty basic and buggy; there's some helpful code here and here.
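Since the question mentions translating to PHP, here's a toy illustration in Go of what a code-generation pass can look like; the node shapes and field names are invented for the example, not from any real tool:

package main

import "fmt"

// Node is a deliberately simplified AST node.
type Node struct {
	Kind     string // "if", "return", "binary", "ident", "literal"
	Text     string // operator, identifier name, or literal value
	Children []*Node
}

// gen walks the tree and emits equivalent PHP source text.
func gen(n *Node) string {
	switch n.Kind {
	case "literal":
		return n.Text
	case "ident":
		return "$" + n.Text // PHP variables are prefixed with $
	case "binary":
		return gen(n.Children[0]) + " " + n.Text + " " + gen(n.Children[1])
	case "return":
		return "return " + gen(n.Children[0]) + ";"
	case "if":
		return "if (" + gen(n.Children[0]) + ") { " + gen(n.Children[1]) + " }"
	}
	return ""
}

func main() {
	// the AST for: if (x > 5) return true;
	ast := &Node{Kind: "if", Children: []*Node{
		{Kind: "binary", Text: ">", Children: []*Node{
			{Kind: "ident", Text: "x"},
			{Kind: "literal", Text: "5"},
		}},
		{Kind: "return", Children: []*Node{{Kind: "literal", Text: "true"}}},
	}}
	fmt.Println(gen(ast)) // if ($x > 5) { return true; }
}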
Are there some good tutorials to learn how I can do it?
As I suggested earlier, use tools to do it. There are a lot of pretty good, well-documented parser frameworks. For further information, you can try asking some people who know about this stuff. DeadMG, over at the Lounge C++, is building a programming language called "Wide"; you may try consulting him.
Let's say I have this statement in a programming language:
if (0 < 1) then
print("Hello")
The lexer will translate it into:
keyword: if
num: 0
op: <
num: 1
keyword: then
keyword: print
string: "Hello"
The parser will then take the information (aka "Token Stream") and make this:
if:
    expression:
        <:
            0, 1
    then:
        print:
            "Hello"
I don't know if this will help or not, but I hope it does.

Trying to write a parser

I'm trying to parse a syntax using the Shunting Yard (SY) algorithm. The syntax includes the following commands (there are many, many others though!):
a + b // a and b are numbers
setxy c d //c,d can be numbers
setxy c+d b+a //all numbers
Essentially, setxy is a function but it doesn't expect any function argument separators. This makes it very difficult (impossible?) to do via SY due to the lack of parens and function argument separators.
Any idea if SY can be used to parse a parentheses-less/function argument separator-less function or should I move on to a different parsing algorithm? If so, which one would you recommend?
Thanks!
djs22
Having defined a correct grammar, you can have ANTLR (http://www.antlr.org/) generate a parser for you. Whether that is an appropriate solution depends on your homework "requirements".
At least you can generate it and look inside for some hints.
I don't fully understand what you are trying to do, but perhaps you could use some regex? What are you trying to do, write a simple command-line program?
