Use FParsec to parse float or int*float - f#

I've just started playing around with FParsec, and I'm now trying to parse strings in the following format:
10*0.5 0.25 0.75 3*0.1 0.9
I want 3*0.1, for example, to be expanded into 0.1 0.1 0.1
What I have so far is the following
type UserState = unit
type Parser<'t> = Parser<'t, UserState>
let str s : Parser<_> = pstring s
let float_ws : Parser<_> = pfloat .>> spaces
let product = pipe2 pint32 (str "*" >>. float_ws) (fun x y -> List.init x (fun i -> y))
The product parser correctly parses entries in the format int*float and expands them into a list of floats. However, I'm having trouble coming up with a solution that allows me to parse either int*float or just a float. I would like to do something like
many (product <|> float_ws)
This will of course not work since the return types of the parsers differ. Any ideas on how to make this work? Is it possible to wrap or modify float_ws such that it returns a list with only one float?

You can make float_ws return a float list by simply adding a |>> List.singleton
let float_ws : Parser<_> = pfloat .>> spaces |>> List.singleton
|>> is just the map function, where you apply some function to the result of one parser and receive a new parser of some new type:
val (|>>): Parser<'a,'u> -> ('a -> 'b) -> Parser<'b,'u>
See: http://www.quanttec.com/fparsec/reference/primitives.html#members.:124::62::62:
Also, because the product parser begins with an int parser, it can consume input before it fails: on an entry that is just a float, it parses the leading digits and only then fails on the missing *. At that point the parser state has changed, and <|> only tries its right-hand parser when the left one fails without consuming input. So you cannot use <|> on product directly; you must wrap it in attempt so FParsec can return to the original parser state.
let combined = many (attempt product <|> float_ws)
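Putting the pieces together, here is a minimal sketch of the whole parser. Two things in it are assumptions rather than part of the answer above: the question's product reuses float_ws, so the sketch keeps a plain float parser (floatWs) and wraps it in a separate single parser instead of redefining float_ws; and since many yields a list of lists, it flattens the result with List.concat.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

// A float followed by optional whitespace.
let floatWs : Parser<float, unit> = pfloat .>> spaces

// int*float, expanded into n copies of the float.
let product : Parser<float list, unit> =
    pipe2 pint32 (pstring "*" >>. floatWs) (fun n x -> List.replicate n x)

// A lone float, wrapped in a list so both alternatives share a type.
let single : Parser<float list, unit> = floatWs |>> List.singleton

// attempt lets the parser backtrack when product consumed digits and then failed.
let combined : Parser<float list, unit> =
    many (attempt product <|> single) |>> List.concat
```

Running combined on "3*0.1 0.9" should give the expanded list [0.1; 0.1; 0.1; 0.9].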

Related

How to parse recursive grammar in FParsec

Previous questions which I could not use to get this to work
Recursive grammars in FParsec
Seems to be an old question which was asked before createParserForwardedToRef was added to FParsec
AST doesn't seem to be as horribly recursive as mine.
Parsing in to a recursive data structure
Grammar relies on a special character '[' to indicate another nesting level. I don't have this luxury
I want to build a sort of Lexer and project system for a language I have found myself writing lately. The language is called q. It is a fairly simple language and has no operator precedence. For example 1*2+3 is the same as (1*(2+3)). It works a bit like a reverse polish notation calculator, evaluation is right to left.
I am having trouble expressing this in FParsec. I have put together the following simplified demo
open FParsec
type BinaryOperator = BinaryOperator of string
type Number = Number of string
type Element =
    | Number of Number
and Expression =
    | Element of Element
    | BinaryExpression of Element * BinaryOperator * Expression
let number = regex "\d+\.?\d*" |>> Number.Number
let element = [ number ] |> choice |>> Element.Number
let binaryOperator = ["+"; "-"; "*"; "%"] |> Seq.map pstring |> choice |>> BinaryOperator
let binaryExpression expression = pipe3 element binaryOperator expression (fun l o r -> (l,o,r))
let expression =
    let exprDummy, expRef = createParserForwardedToRef()
    let elemExpr = element |>> Element
    let binExpr = binaryExpression exprDummy |>> BinaryExpression
    expRef.Value <- [binExpr; elemExpr; ] |> choice
    expRef
let statement = expression.Value .>> eof
let parseString s =
    printfn "Parsing input: '%s'" s
    match run statement s with
    | Success(result, _, _) -> printfn "Ok: %A" result
    | Failure(errorMsg, _, _) -> printfn "Error: %A" errorMsg
//tests
parseString "1.23"
parseString "1+1"
parseString "1*2+3" // equivalent to (1*(2+3))
So far, I haven't been able to come up with a way to satisfy all 3 test cases. In the above, it tries to parse binExpr first and realises it can't, but it must be consuming input along the way, because it doesn't go on to try elemExpr. Not sure what to do. How do I satisfy the 3 tests?
Meditating on Tomas' answer, I have come up with the following that works
let expr, expRef = createParserForwardedToRef()
let binRightExpr = binaryOperator .>>. expr
expRef.Value <- parse {
    let! first = element
    return! choice [
        binRightExpr |>> (fun (o, r) -> (first, o, r) |> BinaryExpression)
        preturn (first |> Element)
    ]
}
let statement = expRef.Value .>> eof
The reason the first parser failed is given in the FParsec docs
The behaviour of the <|> combinator has two important characteristics:
<|> only tries the parser on the right side if the parser on the left
side fails. It does not implement a longest match rule.
However, it only tries the right parser if the left parser fails without consuming input.
Probably need to clean up a few things like the structure of the AST but I think I am good to go.
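The quoted rule about <|> can be seen in isolation with a small sketch. The parsers ab and ac and the helper succeeded are made up for illustration and are not part of the grammar above.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

// On the input "ac", ab consumes the "a" and then fails on "c".
let ab : Parser<string, unit> = pstring "a" >>. pstring "b"
let ac : Parser<string, unit> = pstring "ac"

// The left side fails after consuming input, so ac is never tried.
let withoutAttempt = ab <|> ac

// attempt restores the parser state, so ac gets its turn.
let withAttempt = attempt ab <|> ac

// Hypothetical helper: true if the parser accepts the input.
let succeeded (p: Parser<'a, unit>) (input: string) =
    match run p input with
    | Success (_, _, _) -> true
    | Failure (_, _, _) -> false
```

Here succeeded withoutAttempt "ac" is false while succeeded withAttempt "ac" is true. The working grammar above avoids the problem differently: it parses the shared element prefix once and only then branches, so no backtracking is needed.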

With FParsec how would I parse: line ending in newline <|> a line ending with eof

I'm parsing a file and want to throw away certain lines of the file I'm not interested in. I've been able to get this to work for all cases except for when the last line is a throwaway and does not end in newline.
I've tried constructing an endOfInput rule and joining it with a skipLine rule via <|>. This is all wrapped in a many. Tweaking everything I seem to either get a 'many succeeds without consuming input...' error or a fail on the skipLine rule when I don't try some kind of back track.
let skipLine = many (noneOf "\n") .>> newline |>> fun x -> [string x]
let endOfInput = many (noneOf "\n") .>> eof |>> fun x -> [string x]
test (many (skipLine <|> endOfInput)) "And here is the next.\nThen the last."
** this errors out on the skipLine parser at the last line
I've tried
let skipLine = many (noneOf "\n") .>>? newline |>> fun x -> [string x]
... and ...
let skipLine = many (noneOf "\n") .>> newline |>> fun x -> [string x]
test (many (attempt skipLine <|> endOfInput)) "And here is the next.\nThen the last."
** these produce the many error
Note: the output functions are just place holders to get these to work with my other rules. I haven't gotten into figuring out how to format the output.
This is my first time using FParsec and I'm new to F#.
FParsec actually has a built-in parser that does exactly what you're looking for: skipRestOfLine. It stops at either a newline or eof.
If you want to implement it yourself as a learning exercise, let me know and I'll try to help you figure out the problem. But if you just want a parser that skips characters until the end of the line, skipRestOfLine is what you need.
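As a rough sketch of how this fits inside many: both skipRestOfLine and its sibling restOfLine take a bool saying whether to also consume the newline, and both succeed at eof without consuming anything, which is what triggers the "many succeeds without consuming input" error. Guarding each line with notFollowedBy eof is one way around that. The names line and allLines are made up here, and restOfLine is used instead of skipRestOfLine only so the result is visible.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

// One line, ended by a newline or by eof. The guard stops `many` at eof,
// where restOfLine would otherwise succeed without consuming input.
let line : Parser<string, unit> = notFollowedBy eof >>. restOfLine true

// All lines of the input, whether or not the last one ends in a newline.
let allLines : Parser<string list, unit> = many line .>> eof
```

On "And here is the next.\nThen the last." this should return both lines. To discard a line instead of keeping it, swap restOfLine true for skipRestOfLine true.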
Here's an approach to parsing such files using an Option type.
It helps you parse files that end with a newline or skip blank lines in the middle. I got the solution from this post: "fparsec key-value parser fails to parse". The code below parses a text file with integer values in one column:
module OptionIntParser =
    open FParsec
    open System
    open System.IO
    let pCell: Parser<int, unit> = pint32 |>> fun x -> x
    let pSome = pCell |>> Some
    let pNone = (restOfLine false) >>% None
    let pLine = (attempt pSome) <|> pNone
    let pAllover = sepBy pLine newline |>> List.choose id
    let readFile filePath =
        let rr = File.OpenRead(filePath)
        use reader = new IO.StreamReader(rr)
        reader.ReadToEnd()
    let testStr = readFile("./test1.txt")
    let runAll s =
        let res = run pAllover s in
        match res with
        | Success (rows, _, _) -> rows
        | Failure (s, _, _) -> []
    let myTest =
        let res = runAll testStr
        res |> List.iter (fun (x) -> Console.WriteLine(x.ToString() ))

Parsing the arrow type with FParsec

I'm trying to parse the arrow type with FParsec.
That is, this:
Int -> Int -> Int -> Float -> Char
For example.
I tried with this code, but it only works for one type of arrow (Int -> Int) and no more. I also want to avoid parentheses, because I already have a tuple type that uses them, and I don't want it to be too heavy in terms of syntax either.
let ws = pspaces >>. many pspaces |>> (fun _ -> ())
let str_ws s = pstring s .>> ws
type Type = ArrowType of Type * Type
let arrowtype' =
    pipe2
        (ws >>. ty')
        (ws >>. str_ws "->" >>. ws >>. ty')
        (fun t1 t2 -> ArrowType(t1, t2))
let arrowtype =
    pipe2
        (ws >>. ty' <|> arrowtype')
        (ws >>. str_ws "->" >>. ws >>. ty' <|> arrowtype')
        (fun t1 t2 -> ArrowType(t1, t2)) <?> "arrow type"
ty' is just the parser for the other types, like tuple or identifier.
Do you have a solution?
Before I get into the arrow syntax, I want to comment on your ws parser. Using |>> (fun _ -> ()) is a little inefficient since FParsec has to construct a result object then immediately throw it away. The built-in spaces and spaces1 parsers are probably better for your needs, since they don't need to construct a result object.
Now as for the issue you're struggling with, it looks to me like you want to consider the arrow parser slightly differently. What about treating it as a series of types separated by ->, and using the sepBy family of parser combinators? Something like this:
let arrow = spaces1 >>. pstring "->" .>> spaces1
let arrowlist = sepBy1 ty' arrow
let arrowtype = arrowlist |>> (fun types ->
    types |> List.reduce (fun ty1 ty2 -> ArrowType(ty1, ty2)))
Note that the arrowlist parser would also match against just plain Int, because the definition of sepBy1 is not "there must be at least one list separator", but rather "there must be at least one item in the list". So to distinguish between a type of Int and an arrow type, you'd want to do something like:
let typeAlone = ty' .>> notFollowedBy arrow
let typeOrArrow = attempt typeAlone <|> arrowtype
The use of attempt is necessary here so that the characters consumed by ty' will be backtracked if an arrow was present.
There's a complicating factor I haven't addressed at all since you mentioned not wanting parentheses. But if you decide that you want to be able to have arrow types of arrow types (that is, functions that take functions as input), you'd want to parse types like (Int -> Int) -> (Int -> Float) -> Char. This would complicate the use of sepBy, and I haven't addressed it at all. If you end up needing more complex parsing including parentheses, then it's possible you might want to use OperatorPrecedenceParser. But for your simple needs where parentheses aren't involved, sepBy1 looks like your best bet.
Finally, I should give a WARNING: I haven't tested this at all, just typed this into the Stack Overflow box. The code example I gave you is not intended to be working as-is, but rather to give you an idea of how to proceed. If you need a working-as-is example, I'll be happy to try to give you one, but I don't have the time to do so right now.
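For reference, here is a small self-contained sketch of the sepBy1 idea. Several things in it are assumptions: the Ident case and the ident parser stand in for the question's ty', and the arrow separator allows optional rather than mandatory spaces. It also uses List.reduceBack instead of List.reduce, because List.reduce folds from the left and would turn Int -> Int -> Char into ArrowType(ArrowType(Int, Int), Char), whereas function arrows are conventionally right-associative.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

type Type =
    | Ident of string            // hypothetical stand-in for the other types
    | ArrowType of Type * Type

// Hypothetical stand-in for ty': a bare type name such as Int.
let ident : Parser<Type, unit> = many1Satisfy isLetter |>> Ident

// "->" with optional surrounding spaces. attempt undoes the leading spaces
// when no arrow follows, so sepBy1 can stop cleanly.
let arrow : Parser<unit, unit> = attempt (spaces >>. skipString "->") >>. spaces

// One or more types separated by arrows, nested to the right.
let arrowType : Parser<Type, unit> =
    sepBy1 ident arrow
    |>> List.reduceBack (fun t rest -> ArrowType(t, rest))
```

A lone Int comes back as Ident "Int", because reducing a one-element list returns that element, so this version needs no separate typeAlone case.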

Parsing in to a recursive data structure

I wish to parse a string in to a recursive data structure using F#. In this question I'm going to present a simplified example that cuts to the core of what I want to do.
I want to parse a string of nested square brackets in to the record type:
type Bracket = | Bracket of Bracket option
So:
"[]" -> Bracket None
"[[]]" -> Bracket ( Some ( Bracket None) )
"[[[]]]" -> Bracket ( Some ( Bracket ( Some ( Bracket None) ) ) )
I would like to do this using the parser combinators in the FParsec library. Here is what I have so far:
let tryP parser =
    parser |>> Some
    <|>
    preturn None
/// Parses up to nesting level of 3
let parseBrakets : Parser<_> =
    let mostInnerLevelBracket =
        pchar '['
        .>> pchar ']'
        |>> fun _ -> Bracket None
    let secondLevelBracket =
        pchar '['
        >>. tryP mostInnerLevelBracket
        .>> pchar ']'
        |>> Bracket
    let firstLevelBracket =
        pchar '['
        >>. tryP secondLevelBracket
        .>> pchar ']'
        |>> Bracket
    firstLevelBracket
I even have some Expecto tests:
open Expecto
[<Tests>]
let parserTests =
    [ "[]", Bracket None
      "[[]]", Bracket (Some (Bracket None))
      "[[[]]]", Bracket ( Some (Bracket (Some (Bracket None)))) ]
    |> List.map(fun (str, expected) ->
        str
        |> sprintf "Trying to parse %s"
        |> testCase
        <| fun _ ->
            match run parseBrakets str with
            | Success (x, _,_) -> Expect.equal x expected "These should have been equal"
            | Failure (m, _,_) -> failwithf "Expected a match: %s" m
        )
    |> testList "Bracket tests"
let tests =
    [ parserTests ]
    |> testList "Tests"
runTests defaultConfig tests
The problem is of course how to handle an arbitrary level of nesting - the code above only works for up to 3 levels. The code I would like to write is:
let rec pNestedBracket =
    pchar '['
    >>. tryP pNestedBracket
    .>> pchar ']'
    |>> Bracket
But F# doesn't allow this.
Am I barking up the wrong tree completely with how to solve this (I understand that there are easier ways to solve this particular problem)?
You are looking for FParsec's createParserForwardedToRef function. Because parsers are values and not functions, you cannot define mutually recursive or self-recursive parsers directly. To do so, you have to, in a sense, declare a parser before you define it.
Your final code will end up looking something like this
let bracketParser, bracketParserRef = createParserForwardedToRef<Bracket, unit>()
bracketParserRef := ... // here you can finally define your parser
// you can reference bracketParser, which is a parser that forwards to bracketParserRef
Also I would recommend this article for basic understanding of parser combinators. https://fsharpforfunandprofit.com/posts/understanding-parser-combinators/. The final section on a JSON parser talks about the createParserForwardedToRef method.
As an example of how to use createParserForwardedToRef, here's a snippet from a small parser I wrote recently. It parses lists of space-separated integers between brackets (and the lists can be nested), and the "integers" can be small arithmetic expressions like 1+2 or 3*5.
type ListItem =
    | Int of int
    | List of ListItem list
let pexpr = // ... omitted for brevity
let plist,plistImpl = createParserForwardedToRef()
let pListContents = (many1 (plist |>> List .>> spaces)) <|>
                    (many (pexpr |>> Int .>> spaces))
plistImpl := pchar '[' >>. spaces
             >>. pListContents
             .>> pchar ']'
P.S. I would have put this as a comment to Thomas Devries's answer, but a comment can't contain nicely-formatted code. Go ahead and accept his answer; mine is just intended to flesh his out.
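Applied to the Bracket type from the question, a minimal sketch could look like the following. It assumes unit as the user state, uses FParsec's built-in opt and between in place of the question's tryP helper, and assigns through .Value, which works the same way as := on the ref cell.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

type Bracket = Bracket of Bracket option

// Declare the parser first; pBracket forwards to whatever pBracketRef holds.
let pBracket, pBracketRef = createParserForwardedToRef<Bracket, unit>()

// Define it afterwards, referring to pBracket recursively.
pBracketRef.Value <-
    between (pchar '[') (pchar ']') (opt pBracket) |>> Bracket
```

With this definition, "[[]]" should parse to Bracket (Some (Bracket None)), and deeper nesting needs no extra code.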

Parsing into a complex type

I'm so new at F# and FParsec, I don't even want to embarrass myself by showing what I've got so far.
In the FParsec examples, every type in the ASTs (that I see) are type abbreviations for single values, lists, or tuples.
What if I have a complex type which is supposed to hold, say, a parsed function name and its parameters?
So, f(a, b, c) would be parsed to an object of type PFunction which has a string member Name and a PParameter list member Parameters. How can I go from a parser which can match f(a, b, c) and |>> it into a PFunction?
All I seem to be able to do so far is create the composite parser, but not turn it into anything. The Calculator example would be similar if it made an AST including a type like Term but instead it seems to me to be an interpreter rather than a parser, so there is no AST. Besides, Term would probably just be a tuple of other type abbreviated components.
Thanks!
I think this is what you're looking for:
let pIdentifier o =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier" <| o
let pParameterList p =
    spaces >>.
    pchar '(' >>. spaces >>. sepBy (spaces >>. p .>> spaces) (pchar ',')
    .>> spaces .>> pchar ')'
type FunctionCall(Name: string, Parameters: string list) =
    member this.Name = Name
    member this.Parameters = Parameters
let pFunctionCall o=
    pipe2 (pIdentifier) (pParameterList pIdentifier) (fun name parameters -> FunctionCall(name, parameters)) <|o
This is completely contrived, but here's what I think it would look like, using pipe2 instead of |>>:
type FunctionCall(Name: string, Parameters: string list) =
    member this.Name = Name
    member this.Parameters = Parameters
let pFunctionCall =
    pipe2 (pIdentifier) (pstring "(" >>. pParameterList .>> pstring ")") (fun name parameters -> FunctionCall(name, parameters))
The functional answer would be to use a discriminated union, as Daniel mentioned. FParsec also has a UserState that can be used like a state monad, so if you really want to parse directly into a complex type, you can use that. [1]
[1] http://cs.hubfs.net/topic/None/60071
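As a further illustration, the same pipe2 idea works with a plain F# record, which also gives structural equality for free. This is only a sketch: the PFunction record follows the names in the question, parameters are assumed to be simple identifiers, and the parser names are made up.

```fsharp
// In an .fsx script, reference the package first: #r "nuget: FParsec"
open FParsec

type PFunction = { Name: string; Parameters: string list }

// An identifier: a letter or underscore, then letters, digits or underscores.
let pIdentifier : Parser<string, unit> =
    let isFirst c = isLetter c || c = '_'
    let isRest c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isFirst isRest "identifier"

// A parenthesised, comma-separated list of identifiers.
let pParameters : Parser<string list, unit> =
    between (pchar '(' >>. spaces) (pchar ')')
        (sepBy (pIdentifier .>> spaces) (pchar ',' >>. spaces))

// pipe2 combines the two results into the record.
let pFunction : Parser<PFunction, unit> =
    pipe2 pIdentifier (spaces >>. pParameters)
        (fun name ps -> { Name = name; Parameters = ps })
```

Parsing "f(a, b, c)" should produce { Name = "f"; Parameters = ["a"; "b"; "c"] }.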

Resources