Does the MathNet.Symbolics.Infix parser have a way of identifying more complicated trig functions such as tanh? I have tried the following in F# but it is not recognised (I get an undefined expression). When I replace 'tanh' with 'cos' it works just fine.
open MathNet.Symbolics
let exp = Infix.parseOrUndefined "tanh(x)" //undefined
printfn "%s" (LaTeX.format exp)
Any other libraries for parsing mathematical expressions in F# would also be of interest to me, if they can handle functions such as tanh. Thanks!
Math.NET Symbolics can work with hyperbolic functions. You got the wrong result because there was a small error in the library; it is now fixed. You can wait for the next version or build the DLL from source.
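In the meantime, if you want to detect the failure explicitly rather than formatting an undefined expression, here is a minimal sketch (assuming the Undefined case of the Expression union, which is what parseOrUndefined falls back to):

open MathNet.Symbolics

let printLaTeX s =
    match Infix.parseOrUndefined s with
    | Undefined -> eprintfn "could not parse: %s" s   // parse failure
    | e -> printfn "%s" (LaTeX.format e)

printLaTeX "tanh(x)"  // an error message before the fix; the LaTeX for tanh(x) after it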
There is a need for tracing. The decorator should print the function name, parameter values, and return value. Instead of writing a decorator for each function by hand, it would be terrific if it were possible to do this programmatically.
The current function name can be discovered using reflection via MethodBase.GetCurrentMethod. Functions could be easily decorated with an inline function for logging:
let inline log args f =
    let mi = System.Reflection.MethodBase.GetCurrentMethod()
    let result = f ()
    printfn "%s %A -> %A" mi.Name args result
    result
let add a b = log (a,b) (fun () -> a + b)
add 1 1
Which prints: add (1, 1) -> 2
EDIT: Another option would be to create a wrap function, i.e.:
let inline wrap f =
    fun args ->
        let result = f args
        printfn "%A -> %A" args result
        result
let add (a,b) = a + b
wrap add (1,1)
However, in this case there is no easy way to programmatically retrieve the function name.
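One simple workaround (my addition, not part of the original answer) is to pass the name in explicitly:

let wrapNamed name f =
    fun args ->
        let result = f args
        printfn "%s %A -> %A" name args result
        result

let add (a, b) = a + b
wrapNamed "add" add (1, 1)   // prints: add (1, 1) -> 2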
Yet another option might be to develop a Type Provider that takes an assembly path as a parameter and provides wrapped versions of all members.
I had the same desire previously and found that there are no current automated solutions for F#.
See: Converting OCaml to F#: Is there a simple way to simulate OCaml top-level #trace in F#
While OCaml's trace facility with time travel is the closest comparison to what is desired, it is not an exact fit; but when I use OCaml it is the first inspection tool I reach for.
See: Using PostSharp with F# - Need documentation with working example
The suggestion of using AOP, i.e. PostSharp, was another good one, though as the response from Gael Fraiteur, the Principal Engineer of PostSharp, points out:
PostSharp does not officially support F#.
Other than using reflection as suggested by Phillip Trelford, which I have not tried, the best solution I have found is to manually modify each function to be traced, as I noted in Converting OCaml to F#: Is there a simple way to simulate OCaml top-level #trace in F#, and to save the results to a separate log file using NLog.
See: Using NLog with F# Interactive in Visual Studio - Need documentation
Another route to pursue would be to check out the work on F# for Mono, as there is a lot of work being done there to add extra tooling for use with F#.
Basically, what I have found is that as my F# skills increase, my need to use a debugger or tracing decreases.
At present when I do run into a problem needing this level of inspection, adding the inspection code as noted in Converting OCaml to F#: Is there a simple way to simulate OCaml top-level #trace in F# helps to resolve my misunderstanding.
Also of note: people who come from the C# world to the F# world tend to expect a debugger to be just as useful. Remember that imperative languages tend to be about manipulating data held in variables, and the debugger is used to inspect the values in those variables. With functional programming, at least for me, I try to avoid mutable values, so the only values one needs to inspect are the values being passed to the function, which reduces or obviates the need for a debugger or inspection beyond the function in question.
I'm following a language called 'elm' which is an attempt to bring a Haskell-esque syntax and FRP to JavaScript. There has been some discussion here about implementing the pipeline operator from F#, but the language designer has concerns about the increased cost (I assume in compilation time or compiler implementation complexity) over the more standard (in other FP languages, at least) reverse pipeline operator (which elm already implements). Can anyone speak to this? [Feel free to post directly to that thread as well, or I will paste back the best answers if no one else does.]
https://groups.google.com/forum/?fromgroups=#!topic/elm-discuss/Kt0MbDyRpO4
Thanks!
In the discussion you reference, I see Evan poses two challenges:
Show me some F# project that uses it
Find some credible F# programmer talking about why it is a good idea and what costs come with it (blog post or something).
I'd answer as follows:
The forward pipe-idiom is very common in F# programming, both for stylistic (we like it) and practical (it helps type inference) reasons. Just about any F# project you'll find will use it frequently. Certainly all of my open source projects use it (Unquote, FsEye, NL found here). No doubt you'll find the same with all of the Github located F# projects including the F# compiler source itself.
Brian, a developer on the F# compiler team at Microsoft, blogged about Pipelining in F# back in 2008, a still very interesting and relevant post which relates F# pipes to POSIX pipes. In my own estimation, there is very little cost to implementing a pipe operator. In the F# compiler, this is certainly true in every sense (it's a one-line, inline function definition).
The pipeline operator is actually incredibly simple - here is the standard definition:
let inline (|>) a b = b a
Also, the $ operator discussed in the thread is the reverse pipe operator in F# (<|), which enables you to eliminate some brackets.
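For illustration, here is a minimal sketch of both operators in action (the values are invented):

[1 .. 10]
|> List.filter (fun n -> n % 2 = 0)   // forward pipe threads a value through a chain
|> List.map (fun n -> n * n)
|> printfn "%A"                       // prints [4; 16; 36; 64; 100]

printfn "%d" <| 1 + 2                 // reverse pipe: same as printfn "%d" (1 + 2)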
I don't think adding pipeline operators would have a significant impact on complexity.
In addition to the excellent answers already given here, I'd like to add a couple more points.
Firstly, one of the reasons why the pipeline operator is common in F# is that it helps to circumvent a shortcoming in the way type inference is currently done. Specifically, if you apply an aggregate operation to a collection using a lambda function that accesses a member (OOP-style), type inference will typically fail. For example:
Seq.map (fun z -> z.Real) zs
This fails because F# does not yet know the type of z when it encounters the property Real, so it refuses to compile this code. The idiomatic fix is to use the pipeline operator:
zs |> Seq.map (fun z -> z.Real)
This is strictly uglier (IMO) but it works.
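A self-contained version of that example, assuming System.Numerics.Complex as the element type:

open System.Numerics

let zs = [ Complex(1.0, 2.0); Complex(3.0, 4.0) ]

// Fails to compile: the type of z is unknown when .Real is reached.
//let reals = Seq.map (fun z -> z.Real) zs

// Compiles: the pipe fixes the type of zs first, so z : Complex is inferred.
let reals = zs |> Seq.map (fun z -> z.Real)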
Secondly, the F# pipe operator is nice up to a point, but you cannot currently get the inferred type of an intermediate result. For example:
x
|> h
|> g
|> f
If there is a type error at f, then the programmer will want to know the type of the value being fed into f, in case the problem was actually with h or g, but this is not currently possible in Visual Studio. Ironically, this was easy in OCaml with the Tuareg mode for Emacs, because you could get the inferred type of any subexpression, not just an identifier.
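One practical workaround (my illustration, not from the answer above) is to pin an intermediate type with an annotated identity function; if the annotation is wrong, the compiler reports the mismatch at that stage:

// Invented stand-ins for h, g and f from the example above.
let h (n: int) = float n
let g (x: float) = string x
let f (s: string) = s.Length

let check (x: float) = x   // move the annotation to probe a different stage

42 |> h |> check |> g |> f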
I'm looking for a mature parser library, either for Scala or Haskell.
The most important point is that the library can handle ambiguity.
If an expression is ambiguous, I want every possible abstract syntax tree that matches the expression.
Simple example: The expression a ⊗ b ⊗ c can be seen as (a ⊗ b) ⊗ c or a ⊗ (b ⊗ c), and I need both variants.
Thanks!
I feel like the old guy for remembering when Wadler's papers like Comprehending Monads (the precursor to the do notation) were exciting and new. The idea is that you (to quote) "replace a failure by a list of successes", meaning: maintain a list of all the possible parses. At the end you normally just take the first match, but with this setup, you can take all of them.
These aren't all that efficient for a deterministic parser, which is why they're less in fashion, but they are what you need.
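The core of the idea fits in a few lines. Here is a minimal sketch in F# (used elsewhere in this document; this is not any particular library's API): a parser returns every (result, remaining input) pair, and choice keeps the successes of both alternatives, so ambiguity yields multiple parses instead of a failure.

type Parser<'t, 'a> = 't list -> ('a * 't list) list

let ret x : Parser<'t, 'a> = fun input -> [ x, input ]

let bind (p: Parser<'t, 'a>) (f: 'a -> Parser<'t, 'b>) : Parser<'t, 'b> =
    fun input -> p input |> List.collect (fun (x, rest) -> f x rest)

// Non-deterministic choice: keep the successes of BOTH alternatives.
let (<|>) (p: Parser<'t, 'a>) (q: Parser<'t, 'a>) : Parser<'t, 'a> =
    fun input -> p input @ q input

let item : Parser<'t, 't> = function
    | [] -> []
    | t :: rest -> [ t, rest ]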
Have a look at polyparse, and in particular Text.ParserCombinators.HuttonMeijer and Text.ParserCombinators.HuttonMeijerWallace.
(Hutton & Meijer translated the parser library to Haskell (from Gofer) and Wallace added extra features.)
Make sure you check it out on simple cases like parsing "aaaa" with
testP = do
  a <- many $ char 'a'
  b <- many $ char 'a'
  return (a,b)
to see if it has the semantics you seek (run to the end of input, an ambiguous parser should return all five ways of splitting the four 'a's between a and b).
You asked for mature. These libraries are part of pure functional programming's heritage! Having said that, I'd call parsec more mature, even though it's younger.
(Speculation: I don't think parsec can do what you want. Its standard choice combinator is deterministic. I haven't looked into tweaking or replacing that behaviour, and I wouldn't want to I'm afraid.)
This question immediately reminded me of the Yacc is dead / No, it's not debate from the end of 2010. The authors of the Yacc is dead paper provide a library in Scala (unmaintained), Haskell and Racket. In the Yacc is alive response, Russ Cox points out that the code runs in exponential time for ambiguous grammars.
It's well-known that it is possible to parse ambiguous grammars in O(n^3), although obviously it can take exponential time to enumerate all the parse trees in the case that there are exponentially many of them -- and there will be in the case of x1 + x2 + x3 + ... + xn, where the number of trees is a Catalan number. bison implements the GLR algorithm which does so; unfortunately, while bison is certainly mature (if not actually moribund), it is written neither in Haskell nor in Scala.
Daniel Spiewak implemented a GLL parser in Scala IIRC, but last time I looked at it, it suffered from some performance issues. So I'm not sure that it could be described as mature, either.
I can't speak to how mature it is or give you any usage examples, but I've had the Scala gll-combinators library open in a tab for a few days. It handles ambiguous grammars and looks pretty nifty.
In the end the choice fell on the Syntax Definition Formalism (SDF2), with an SDF table generator here and JSGLR as the parser.
I'm making my own JavaScript-based programming language (yeah, it is crazy, but it's for learning only... maybe?). Well, I'm reading about parsers, and the first pass is to convert the source code to tokens, like:
if(x > 5)
return true;
The tokenizer turns this into:
T_IF "if"
T_LPAREN "("
T_IDENTIFIER "x"
T_GT ">"
T_NUMBER "5"
T_RPAREN ")"
T_IDENTIFIER "return"
T_TRUE "true"
T_TERMINATOR ";"
I don't know if my logic is correct so far. In my parser it goes even further (or not?) and translates it to this (yeah, a multidimensional array):
T_IF "if"
T_EXPRESSION ...
T_IDENTIFIER "x"
T_GT ">"
T_NUMBER "5"
T_CLOSURE ...
T_IDENTIFIER "return"
T_TRUE "true"
I have some questions:
Is my way better or worse than the original way? Note that my code will be read and compiled (translated to another language, like PHP), instead of interpreted all the time.
After tokenizing, what do I need to do exactly? I'm really lost at this stage!
Are there any good tutorials for learning how to do this?
Well, that's it. Bye!
Generally, you want to separate the functions of the tokeniser (also called a lexer) from other stages of your compiler or interpreter. The reason for this is basic modularity: each pass consumes one kind of thing (e.g., characters) and produces another one (e.g., tokens).
So you’ve converted your characters to tokens. Now you want to convert your flat list of tokens to meaningful nested expressions, and this is what is conventionally called parsing. For a JavaScript-like language, you should look into recursive descent parsing. For parsing expressions with infix operators of different precedence levels, Pratt parsing is very useful, and you can fall back on ordinary recursive descent parsing for special cases.
Just to give you a more concrete example based on your case, I’ll assume you can write two functions: accept(token) and expect(token), which test the next token in the stream you’ve created. You’ll make a function for each type of statement or expression in the grammar of your language. Here’s Pythonish pseudocode for a statement() function, for instance:
def statement():
    if accept("if"):
        x = expression()
        y = statement()
        return IfStatement(x, y)
    elif accept("return"):
        x = expression()
        return ReturnStatement(x)
    elif accept("{"):
        xs = []
        while True:
            xs.append(statement())
            if not accept(";"):
                break
        expect("}")
        return Block(xs)
    else:
        error("Invalid statement!")
This gives you what’s called an abstract syntax tree (AST) of your program, which you can then manipulate (optimisation and analysis), output (compilation), or run (interpretation).
Most toolkits split the complete process into two separate parts
lexer (aka. tokenizer)
parser (aka. grammar)
The tokenizer will split the input data into tokens. The parser will only operate on the token "stream" and build the structure.
Your question seems to be focused on the tokenizer. But your second solution mixes the grammar parser and the tokenizer into one step. Theoretically this is also possible, but for a beginner it is much easier to do it the same way most other tools/frameworks do: keep the steps separate.
To your first solution: I would tokenize your example like this:
T_KEYWORD_IF "if"
T_LPAREN "("
T_IDENTIFIER "x"
T_GT ">"
T_LITERAL "5"
T_RPAREN ")"
T_KEYWORD_RET "return"
T_KEYWORD_TRUE "true"
T_TERMINATOR ";"
In most languages, keywords cannot be used as method names, variable names and so on. This is already reflected at the tokenizer level (T_KEYWORD_IF, T_KEYWORD_RET, T_KEYWORD_TRUE).
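To make that concrete, here is a hypothetical tokenizer for the example in F# (the token names mirror the list above; the implementation details are invented for illustration):

open System

type Token =
    | T_KEYWORD_IF | T_KEYWORD_RET | T_KEYWORD_TRUE
    | T_LPAREN | T_RPAREN | T_GT | T_TERMINATOR
    | T_IDENTIFIER of string
    | T_LITERAL of int

// Keywords are separated from identifiers here, at the tokenizer level.
let keyword = function
    | "if" -> Some T_KEYWORD_IF
    | "return" -> Some T_KEYWORD_RET
    | "true" -> Some T_KEYWORD_TRUE
    | _ -> None

let rec tokenize = function
    | [] -> []
    | c :: rest when Char.IsWhiteSpace c -> tokenize rest
    | '(' :: rest -> T_LPAREN :: tokenize rest
    | ')' :: rest -> T_RPAREN :: tokenize rest
    | '>' :: rest -> T_GT :: tokenize rest
    | ';' :: rest -> T_TERMINATOR :: tokenize rest
    | (c :: _) as chars when Char.IsDigit c ->
        let digits = chars |> List.takeWhile Char.IsDigit
        let rest = chars |> List.skipWhile Char.IsDigit
        T_LITERAL (int (String(List.toArray digits))) :: tokenize rest
    | (c :: _) as chars when Char.IsLetter c ->
        let word = String(chars |> List.takeWhile Char.IsLetter |> List.toArray)
        let rest = chars |> List.skipWhile Char.IsLetter
        (match keyword word with Some k -> k | None -> T_IDENTIFIER word) :: tokenize rest
    | c :: _ -> failwithf "unexpected character: %c" c

// tokenize (List.ofSeq "if(x > 5) return true;")
// => [T_KEYWORD_IF; T_LPAREN; T_IDENTIFIER "x"; T_GT; T_LITERAL 5;
//     T_RPAREN; T_KEYWORD_RET; T_KEYWORD_TRUE; T_TERMINATOR]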
The next level would take this stream and - by applying a formal grammar - build some data structure (often called an AST, Abstract Syntax Tree) which might look like this:
IfStatement:
  Expression:
    BinaryOperator:
      Operator: T_GT
      LeftOperand:
        IdentifierExpression:
          "x"
      RightOperand:
        LiteralExpression:
          5
  IfBlock:
    ReturnStatement:
      ReturnExpression:
        LiteralExpression:
          "true"
  ElseBlock: (empty)
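In F#, for instance, that shape could be encoded as a pair of union types (a sketch; the names are invented to mirror the tree above):

type Expr =
    | IdentifierExpression of string
    | IntLiteral of int
    | BoolLiteral of bool
    | BinaryOperator of op: string * left: Expr * right: Expr

type Stmt =
    | IfStatement of cond: Expr * ifBlock: Stmt list * elseBlock: Stmt list
    | ReturnStatement of Expr

// The example tree above, as a value:
let ast =
    IfStatement (
        BinaryOperator (">", IdentifierExpression "x", IntLiteral 5),
        [ ReturnStatement (BoolLiteral true) ],
        [])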
Implementing the parser is usually done with some framework. Implementing something like that by hand, and efficiently, is usually done at a university over the better part of a semester. So you really should use some kind of framework.
The input for a grammar parser framework is usually a formal grammar in some kind of BNF. Your "if" part might look like this:
IfStatement: T_KEYWORD_IF T_LPAREN Expression T_RPAREN Statement ;
Expression: LiteralExpression | BinaryExpression | IdentifierExpression | ... ;
BinaryExpression: LeftOperand BinaryOperator RightOperand;
....
That's only to give you the idea. Parsing a real-world language like JavaScript correctly is not an easy task. But fun.
Is my way better or worse than the original way? Note that my code will be read and compiled (translated to another language, like PHP), instead of interpreted all the time.
What's the original way? There are many different ways to implement languages. I think yours is fine, actually; I once tried to build a language myself that translated to C#, the hack programming language. Many language compilers translate to an intermediate language; it's quite common.
After tokenizing, what do I need to do exactly? I'm really lost at this stage!
After tokenizing, you need to parse it. Use some good lexer/parser framework, such as Boost.Spirit, or Coco, or whatever. There are hundreds of them. Or you can implement your own lexer, but that takes time and resources. There are many ways to parse code; I generally rely on recursive descent parsing.
Next you need to do code generation. That's the most difficult part in my opinion. There are tools for that too, but you can do it manually if you want to. I tried to do it in my project, but it was pretty basic and buggy; there's some helpful code here and here.
Are there any good tutorials for learning how to do this?
As I suggested earlier, use tools to do it. There are a lot of pretty good, well-documented parser frameworks. For further information, you can try asking some people who know about this stuff. #DeadMG, over at the Lounge C++, is building a programming language called "Wide". You may try consulting him.
Let's say I have this statement in a programming language:
if (0 < 1) then
print("Hello")
The lexer will translate it into:
keyword: if
num: 0
op: <
num: 1
keyword: then
keyword: print
string: "Hello"
The parser will then take the information (aka "Token Stream") and make this:
if:
  expression:
    <:
      0, 1
  then:
    print:
      "Hello"
I don't know if this will help or not, but I hope it does.
By concept/function/implementation, what are the differences between compilers and parsers?
A compiler is often made up of several components, one of which is a parser.
A common set of components in a compiler is:
Lexer - break the program up into words.
Parser - check that the syntax of the sentences is correct.
Semantic Analysis - check that the sentences make sense.
Optimizer - edit the sentences for brevity.
Code generator - output something with equivalent semantic meaning using another vocabulary.
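To make those components concrete, here is a deliberately tiny end-to-end toy in F# for a "language" of single-digit sums such as 1+2+3 (semantic analysis and optimization are omitted; everything here is invented for illustration):

type Ast = Num of int | Add of Ast * Ast

// Lexer: break the input up into "words" (here, single characters).
let lexer (src: string) : char list =
    [ for c in src do if c <> ' ' then yield c ]

// Parser: check the syntax and build an AST.
let rec parser (tokens: char list) : Ast =
    match tokens with
    | [ d ] when System.Char.IsDigit d -> Num (int d - int '0')
    | d :: '+' :: rest when System.Char.IsDigit d -> Add (Num (int d - int '0'), parser rest)
    | _ -> failwith "syntax error"

// Code generator: output something with the same meaning in another vocabulary.
let rec codegen (ast: Ast) : string =
    match ast with
    | Num n -> string n
    | Add (a, b) -> sprintf "(add %s %s)" (codegen a) (codegen b)

"1+2+3" |> lexer |> parser |> codegen   // => "(add 1 (add 2 3))"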
To add a little bit:
As mentioned elsewhere, Small-C is a recursive descent compiler that generated code as it parsed. Basically syntax analysis, semantic analysis, and code generation in one pass. As I recall, it also lexed in the parser.
A long time ago, I wrote a C compiler (actually several: the Introl-C family for microcontrollers) that used recursive descent and did syntax and semantic checking during the parse and produced a tree representation of the program from which code was generated.
Today, I'm working on a compiler that does source -> tokens -> AST -> IR -> code, pretty much as I described above.
A parser just reads a text into an internal, more abstract representation, often a tree or graph of some sort.
A compiler translates such an internal representation into another format. Most often this means converting source code into executable programs. But the target doesn't have to be machine code. It can be another programming language as well; the compiler would still be a compiler. Obviously a compiler needs a parser to actually read its input.
A compiler always has a parser inside. The parser just processes the language and returns a tree representation of it; the compiler generates something from that tree, either actual machine code or another language.
A parser is one element of a compiler.
Are you looking for the differences between an interpreter and a compiler?
A parser takes in raw data and parses it into a tree structure. This syntax tree is then passed on to a generator, which will turn it into whatever it is supposed to generate.
So, a parser is a part of a compiler.
In general, a parser is part of a compiler, but a compiler is designed to convert the received script into machine-readable code or, sometimes, into another language.
A compiler is a special type of computer program that translates a human readable text file into a form that the computer can more easily understand. At its most basic level, a computer can only understand two things, a 1 and a 0. At this level, a human will operate very slowly and find the information contained in the long string of 1s and 0s incomprehensible. A compiler is a computer program that bridges this gap.
A parser is a piece of software that evaluates the syntax of a script when it is executed on a web server. For scripting languages used on the web, the parser works like a compiler might work in other types of application development environments. Parsers are commonly used in script development because they can evaluate code when the script is executed and do not require that the code be compiled first.