I'm struggling with a bizarre parse error when running the Kastree parser to build an AST from this example Kotlin code:
fun bar() {
val a = "constant"
val b = 0
while (b < 10) {
if (b < 5) {
println("b lt 5")
} else {
println("b gt 5")
}
}
if (true)
return
if (false) {
return
}
println(a)
}
I use this main function to parse the program above:
import kastree.ast.psi.Parser
import java.io.File
import java.time.LocalDateTime
fun main(args: Array<String>) {
val fileName = "C:/Users/LETS_GO_ON/Documents/kotcfg/KotlinCFG/src/test/kotlin/kotlincfg/Test11.kt"
val file = File(fileName)
if (!file.exists()){
println("File does not exist")
return
}
val codeStr = file.readText()
val fileAst = Parser.parseFile(codeStr)
val builder = GraphBuilder(fileAst)
val graph = builder.build()
val dotName = "dot.dot"
val imgName = "graph-${LocalDateTime.now()}.png"
exportToDot(graph, dotName)
dotFileToImage(dotName, imgName)
}
Once it starts, an error message gets printed:
Exception in thread "main" ParseError(file=KtFile: temp.kt, errors=[PsiErrorElement:Expecting a top level declaration, PsiErrorElement:Expecting a top level declaration, PsiErrorElement:Expecting an element, PsiErrorElement:Expecting an expression, PsiErrorElement:Expecting '->', PsiErrorElement:Expecting an expression, is-condition or in-condition, PsiErrorElement:Expecting an expression, PsiErrorElement:Expecting '->', PsiErrorElement:Expecting an expression, is-condition or in-condition, PsiErrorElement:Expecting an expression, PsiErrorElement:Expecting '->', PsiErrorElement:Expecting an element, PsiErrorElement:Unexpected tokens (use ';' to separate expressions on the same line), PsiErrorElement:Expecting an element, PsiErrorElement:Expecting an element, PsiErrorElement:Unexpected tokens (use ';' to separate expressions on the same line), PsiErrorElement:Expecting an element])
at kastree.ast.psi.Parser.parseFile(Parser.kt:25)
at kastree.ast.psi.Parser.parseFile$default(Parser.kt:23)
at main.kotlin.kotlincfg.MainKt.main(Main.kt:20)
The Kastree dependency in my Maven project:
<dependency>
<groupId>com.github.cretz.kastree</groupId>
<artifactId>kastree-ast-psi</artifactId>
<version>0.4.0</version>
</dependency>
The OS is Windows 10 Home, and I use JDK 1.8.0_271 to run the Maven project in IntelliJ.
So, I wonder if somebody could explain what is wrong here in such a (seemingly) simple case of parsing code.
It seems that this parser doesn't handle the CRLF newline format.
The workaround is simple: replace CRLF with LF before parsing:
val codeStr = file.readText().replace("\r\n", "\n")
I'm trying to understand the reader monad transformer. I'm using FSharpPlus and try to compile the following sample which first reads something from the reader environment, then performs some async computation and finally combines both results:
open FSharpPlus
open FSharpPlus.Data
let sampleReader = monad {
let! value = ask
return value * 2
}
let sampleWorkflow = monad {
do! Async.Sleep 5000
return 4
}
let doWork = monad {
let! envValue = sampleReader
let! workValue = liftAsync sampleWorkflow
return envValue + workValue
}
ReaderT.run doWork 3 |> Async.RunSynchronously |> printfn "Result: %d"
With this I get a compilation error at the line where it says let! value = ask with the following totally unhelpful (at least for me) error message:
Type constraint mismatch when applying the default type 'obj' for a type inference variable. No overloads match for method 'op_GreaterGreaterEquals'.
Known return type: Async
Known type parameters: < obj , (int -> Async) >
It feels like I'm just missing some operator somewhere, but I can't figure it out.
Your code is correct, but F# type inference is not that smart in cases like this.
If you add a type annotation to sampleReader it will compile fine:
let sampleReader : ReaderT<int,Async<_>> = monad {
let! value = ask
return value * 2
}
// val sampleReader : FSharpPlus.Data.ReaderT<int,Async<int>> =
// ReaderT <fun:sampleReader@7>
Update:
After reading your comments.
If what you want is to make it generic, first of all your function has to be declared inline, otherwise type constraints can't be applied:
let inline sampleReader = monad ...
But that takes you to the second problem: a constant can't be declared inline (actually there is a way, but it's too complicated); only functions can.
So the easiest is to make it a function:
let inline sampleReader () = monad ...
And now the third problem: the code doesn't compile :)
Here again, you can give type inference a minimal hint; saying at the call site that you expect a ReaderT<_,_> is enough:
let inline sampleReader () = monad {
let! value = ask
return value * 2
}
let sampleWorkflow = monad {
do! Async.Sleep 5000
return 4
}
let doWork = monad {
let! envValue = sampleReader () : ReaderT<_,_>
let! workValue = liftAsync sampleWorkflow
return envValue + workValue
}
ReaderT.run doWork 3 |> Async.RunSynchronously |> printfn "Result: %d"
Conclusion:
Defining a generic function is not a trivial task in F#.
If you look into the source of F#+ you'll see what I mean.
After running your example you'll see all the constraints being generated, and you'll probably notice how the compile time increased once you made your function inline and generic.
These are all indications that we're pushing the F# type system to its limits.
Although F#+ defines some ready-to-use generic functions, and these can sometimes be combined to create your own generic functions, that's not the goal of the library. You can do it, but then you're on your own; in some scenarios, such as exploratory development, it might make sense.
I don't know if this info is relevant to the question, but I am learning Scala parser combinators.
Using some examples (in this master thesis) I was able to write a simple functional (in the sense that it is non imperative) programming language.
Is there a way to improve my parser/evaluator such that it could allow/evaluate input like this:
<%
import scala.<some package / classes>
import weka.<some package / classes>
%>
some DSL code (lambda calculus)
<%
System.out.println("asdasd");
J48 j48 = new J48();
%>
as input written in the guest language (DSL)?
Should I use reflection or something similar* to evaluate such input?
Is there some source code you would recommend studying (maybe the Groovy sources)?
Maybe this is something similar: runtime compilation, but I am not sure this is the best alternative.
EDIT
Complete answer given below with "{" and "}". Maybe "{{" would be better.
The question is what the meaning of such import statements should be.
Perhaps you start first with allowing references to java methods in your language (the Lambda Calculus, I guess?).
For example:
java.lang.System.out.println "foo"
If you have that, you can then add resolution of unqualified names like
println "foo"
But here comes the first problem: println exists in System.out and System.err, or, to be more correct: it is a method of PrintStream, and both System.err and System.out are PrintStreams.
Hence you would need some notion of Objects, Classes, Types, and so on to do it right.
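As a rough illustration of the fully qualified case, the lookup of java.lang.System.out.println "foo" can be sketched with plain Java reflection from Scala. This is only a sketch under assumptions: the object JavaCallSketch and its two helpers are hypothetical names, it handles just a static field followed by a one-String-argument method, and it does nothing about the unqualified-name ambiguity described above.

```scala
object JavaCallSketch {
  // Hypothetical helper: read a public static field such as java.lang.System.out.
  def staticField(className: String, fieldName: String): AnyRef =
    Class.forName(className).getField(fieldName).get(null)

  // Hypothetical helper: call a public method taking exactly one String argument.
  // Returns null when the Java method is declared void.
  def callWithString(target: AnyRef, methodName: String, arg: String): AnyRef =
    target.getClass.getMethod(methodName, classOf[String]).invoke(target, arg)

  def main(args: Array[String]): Unit = {
    val out = staticField("java.lang.System", "out")
    val err = staticField("java.lang.System", "err")
    // Both targets are PrintStreams, so a bare "println" cannot tell them apart.
    callWithString(out, "println", "foo")
    callWithString(err, "println", "foo")
  }
}
```

The method is looked up on the runtime class of the target, which is why some notion of objects and types is needed before unqualified names can be resolved.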
I managed to run Scala code embedded in my interpreted DSL.
Inserting DSL variables into the Scala code and recovering the return value come as a bonus. :)
Minimal relevant code from parsing and interpreting until performing embedded Scala code run-time execution (Main Parser AST and Interpreter):
object Main extends App {
val ast = Parser1 parse "some dsl code here"
Interpreter eval ast
}
object Parser1 extends RegexParsers with ImplicitConversions {
import AST._
val separator = ";"
def parse(input: String): Expr = parseAll(program, input).get
type P[+T] = Parser[T]
def program = rep1sep(expr, separator) <~ separator ^^ Sequence
def expr: Parser[Expr] = (assign /*more calls here*/)
def scalacode: P[Expr] = "{" ~> rep(scala_text) <~ "}" ^^ {case l => Scalacode(l.flatten)}
def scala_text = text_no_braces ~ "$" ~ ident ~ text_no_braces ^^ {case a ~ b ~ c ~ d => List(a, b + c, d)}
//more rules here
def assign = ident ~ ("=" ~> atomic_expr) ^^ Assign
//more rules here
def atomic_expr = (
ident ^^ Var
//more calls here
| "(" ~> expr <~ ")"
| scalacode
| failure("expression expected")
)
def text_no_braces = """[a-zA-Z0-9\"\'\+\-\_!##%\&\(\)\[\]\/\?\:;\.\>\<\,\|= \*\\\n]*""".r //| fail("Scala code expected")
def ident = """[a-zA-Z]+[a-zA-Z0-9]*""".r
}
object AST {
sealed abstract class Expr
// more classes here
case class Scalacode(items: List[String]) extends Expr
case class Literal(v: Any) extends Expr
case class Var(name: String) extends Expr
}
object Interpreter {
import AST._
val env = collection.immutable.Map[VarName, VarValue]()
def run(code: String) = {
val code2 = "val res_1 = (" + code + ")"
interpret.interpret(code2)
val res = interpret.valueOfTerm("res_1")
if (res == None) Literal() else Literal(res.get)
}
class Context(private var env: Environment = initEnv) {
def eval(e: Expr): Any = e match {
case Scalacode(l: List[String]) => {
val r = l map {
x =>
if (x.startsWith("$")) {
eval(Var(x.drop(1)))
} else {
x
}
}
eval(run(r.mkString))
}
case Assign(id, expr) => env += (id -> eval(expr))
//more pattern matching here
case Literal(v) => v
case Var(id) => {
env getOrElse(id, sys.error("Undefined " + id))
}
}
}
}
First, the code:
package com.digitaldoodles.markup
import scala.util.parsing.combinator.{Parsers, RegexParsers}
import com.digitaldoodles.rex._
class MarkupParser extends RegexParsers {
val stopTokens = (Lit("{{") | "}}" | ";;" | ",,").lookahead
val name: Parser[String] = """[##!$]?[a-zA-Z][a-zA-Z0-9]*""".r
val content: Parser[String] = (patterns.CharAny ** 0 & stopTokens).regex
val function: Parser[Any] = name ~ repsep(content, "::") <~ ";;"
val block1: Parser[Any] = "{{" ~> function
val block2: Parser[Any] = "{{" ~> function <~ "}}"
val lst: Parser[Any] = repsep("[a-z]", ",")
}
object ParseExpr extends MarkupParser {
def main(args: Array[String]) {
println("Content regex is ", (patterns.CharAny ** 0 & stopTokens).regex)
println(parseAll(block1, "{{#name 3:4:foo;;"))
println(parseAll(block2, "{{#name 3:4:foo;; stuff}}"))
println(parseAll(lst, "a,b,c"))
}
}
Then, the run results:
[info] == run ==
[info] Running com.digitaldoodles.markup.ParseExpr
(Content regex is ,(?:[\s\S]{0,})(?=(?:(?:\{\{|\}\})|;;)|\,\,))
[1.18] parsed: (#name~List(3:4:foo))
[1.24] failure: `;;' expected but `}' found
{{#name 3:4:foo;; stuff}}
^
[1.1] failure: string matching regex `\z' expected but `a' found
a,b,c
^
I use a custom library to assemble some of my regexes, so I've printed out the "content" regex; it's supposed to be basically any text up to but not including certain token patterns, enforced using a positive lookahead assertion.
Finally, the problems:
1) The first run on "block1" succeeds, but shouldn't, because the separator in the "repsep" function is "::", yet ":" are parsed as separators.
2) The run on "block2" fails, presumably because the lookahead clause isn't working--but I can't figure out why this should be. The lookahead clause was already exercised in the "repsep" on the run on "block1" and seemed to work there, so why should it fail on block 2?
3) The simple repsep exercise on "lst" fails because internally, the parser engine seems to be looking for a boundary--is this something I need to work around somehow?
Thanks,
Ken
1) No, ":" is not parsed as a separator. If it were, the output would be (#name~List(3, 4, foo)).
2) It happens because "}}" is also a delimiter, so it takes the longest match it can -- the one that includes ";;" as well. If you make the preceding expression non-eager, it will then fail at "s" on "stuff", which I presume is what you expected.
3) You passed a literal, not a regex. Modify "[a-z]" to "[a-z]".r and it will work.
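Point 2 can be reproduced without the parser combinators. The following is a minimal sketch using plain scala.util.matching.Regex; the object and value names are made up for illustration, and the stop-token lookahead is a hand-written simplification of the printed "content" regex rather than the output of the custom library.

```scala
object LookaheadDemo {
  // Lookahead for any of the stop tokens: {{  }}  ;;  ,,
  val stop = """(?=\{\{|\}\}|;;|,,)"""

  // Greedy: takes as much text as possible before some stop token.
  val greedy = ("""[\s\S]*""" + stop).r
  // Reluctant (non-eager): stops at the first stop token it meets.
  val reluctant = ("""[\s\S]*?""" + stop).r

  // The text that follows the name in the block2 example.
  val input = "3:4:foo;; stuff}}"

  val greedyMatch: Option[String] = greedy.findPrefixOf(input)
  val reluctantMatch: Option[String] = reluctant.findPrefixOf(input)

  def main(args: Array[String]): Unit = {
    println(greedyMatch)    // Some(3:4:foo;; stuff)
    println(reluctantMatch) // Some(3:4:foo)
  }
}
```

The greedy version swallows the ";;" and stops only before "}}", which matches the reported failure at column 24; the reluctant version stops before ";;".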
When I tried some console programming, I got an unexpected result.
open System
let printSomeMessage =
printfn "Is this the F# BUG?"
[<EntryPoint>]
let main args =
if args.Length = 2 then
printSomeMessage
else
printfn "Args.Length is not two."
0
The printSomeMessage function was included in the .cctor() function. Here is the ILDASM output.
.method private specialname rtspecialname static
void .cctor() cil managed
{
// Code size 24 (0x18)
.maxstack 4
IL_0000: nop
IL_0001: ldstr "Is this the F# BUG\?"
IL_0006: newobj instance void class [FSharp.Core]Microsoft.FSharp.Core.PrintfFormat`5<class [FSharp.Core]Microsoft.FSharp.Core.Unit,class [mscorlib]System.IO.TextWriter,class [FSharp.Core]Microsoft.FSharp.Core.Unit,class [FSharp.Core]Microsoft.FSharp.Core.Unit,class [FSharp.Core]Microsoft.FSharp.Core.Unit>::.ctor(string)
IL_000b: call !!0 [FSharp.Core]Microsoft.FSharp.Core.ExtraTopLevelOperators::PrintFormatLine<class [FSharp.Core]Microsoft.FSharp.Core.Unit>(class [FSharp.Core]Microsoft.FSharp.Core.PrintfFormat`4<!!0,class [mscorlib]System.IO.TextWriter,class [FSharp.Core]Microsoft.FSharp.Core.Unit,class [FSharp.Core]Microsoft.FSharp.Core.Unit>)
IL_0010: dup
IL_0011: stsfld class [FSharp.Core]Microsoft.FSharp.Core.Unit '<StartupCode$FSharpBugTest>'.$Program::printSomeMessage@3
IL_0016: pop
IL_0017: ret
} // end of method $Program::.cctor
So the execution result looks like this:
Is this the F# BUG?
Args.Length is not two.
Am I missing some syntax or F# characteristic? Or is it a bug in the F# compiler?
No, it's a bug in your code. You need to add parentheses after "printSomeMessage"; otherwise printSomeMessage is a simple value rather than a function.
open System
let printSomeMessage() =
printfn "Is this the F# BUG?"
[<EntryPoint>]
let main args =
if args.Length = 2 then
printSomeMessage()
else
printfn "Args.Length is not two."
0
Simple values are initialized in the constructor of a module, so you see your code being called when the module is initialized. This is logical when you think about it: the normal case for a simple value is binding a string, integer, or other literal to an identifier, and you would expect that to happen at startup. For example, the following will be bound at module startup:
let x = 1
let y = "my string"
In C# I could create a string representation of an object graph fairly easily with expression trees.
public static string GetGraph<TModel, T>(TModel model, Expression<Func<TModel, T>> action) where TModel : class
{
var method = action.Body as MethodCallExpression;
var body = method != null ? method.Object != null ? method.Object as MemberExpression : method.Arguments.Any() ? method.Arguments.First() as MemberExpression : null : action.Body as MemberExpression;
if (body != null)
{
string graph = GetObjectGraph(body, typeof(TModel));
return graph;
}
throw new Exception("Could not create object graph");
}
In F# I've been looking at Quotations to attempt to do the same thing, and can't quite figure it out. I've attempted converting the quotation into an Expression using the PowerPack libraries, but have had no luck so far, and the information on the internet seems fairly sparse on this topic.
If the input is:
let result = getGraph myObject <@ myObject.MyProperty @>
the output should be "myobject.MyProperty"
You can see what you get from a quotation expression in an fsi session:
> let v = "abc"
> <@ v.Length @>;;
val it : Expr<int>
= PropGet (Some (PropGet (None, System.String v, [])), Int32 Length, [])
> <@ "abc".Length @>;;
val it : Expr<int>
= PropGet (Some (Value ("abc")), Int32 Length, [])
You can find a description of all the active patterns available for parsing quotations in
manual\FSharp.Core\Microsoft.FSharp.Quotations.Patterns.html
under your F# installation directory or on the MSDN site.
There is also Chris Smith's nice book "Programming F#", with a chapter named "Quotations" :)
So, after all that, just try writing a simple parser:
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Quotations.DerivedPatterns
let rec getGraph (expr: Expr) =
let parse args =
List.fold_left (fun acc v -> acc ^ (if acc.Length > 0 then "," else "") ^ getGraph v) "" args
let descr s = function
| Some v -> "(* instance " ^ s ^ "*) " ^ getGraph v
| _ -> "(* static " ^ s ^ "*)"
match expr with
| Int32 i -> string i
| String s -> sprintf "\"%s\"" s
| Value (o,t) -> sprintf "%A" o
| Call (e, methodInfo, av) ->
sprintf "%s.%s(%s)" (descr "method" e) methodInfo.Name (parse av)
| PropGet(e, methodInfo, av) ->
sprintf "%s.%s(%s)" (descr "property" e) methodInfo.Name (parse av)
| _ -> failwithf "I don't understand this form of expression yet: %A" expr
P.S. And of course you will need some code to translate the AST into a human-readable format.
I'm unsure what the state of things was back when you asked this question, but today you can convert an F# Quotation to an Expression using the PowerPack like so:
<@ "asdf".Length @>.ToLinqExpression()
Also, I've been developing a library, Unquote, which is able to decompile many F# Quotations into single-line, non-light-syntax F# code. It can easily handle simple instance PropertyGet expressions like your required input / output:
> decompile <@ "asdf".Length @>;;
val it : string = ""asdf".Length"
See my answer to a similar question for more information or just visit Unquote's home page.