Parsing int or float with FParsec - f#

I'm trying to parse a file, using FParsec, which consists of either float or int values. I'm facing two problems that I can't find a good solution for.
1
Both pint32 and pfloat will successfully parse the same string but give different answers, e.g., pint32 will return 3 when parsing the string "3.0" and pfloat will return 3.0 when parsing the same string. Is it possible to try parsing a floating point value using pint32 and have it fail if the string is "3.0"?
In other words, is there a way to make the following code work:
let parseFloatOrInt lines =
    let rec loop intvalues floatvalues lines =
        match lines with
        | [] -> floatvalues, intvalues
        | line::rest ->
            match run floatWs line with
            | Success (r, _, _) -> loop intvalues (r::floatvalues) rest
            | Failure _ ->
                match run intWs line with
                | Success (r, _, _) -> loop (r::intvalues) floatvalues rest
                | Failure _ -> loop intvalues floatvalues rest
    loop [] [] lines
This piece of code will correctly place all floating point values in the floatvalues list, but because pfloat returns "3.0" when parsing the string "3", all integer values will also be placed in the floatvalues list.
2
The above code example seems a bit clumsy to me, so I'm guessing there must be a better way to do it. I considered combining them using choice, however both parsers must return the same type for that to work. I guess I could make a discriminated union with one option for float and one for int and convert the output from pint32 and pfloat using the |>> operator. However, I'm wondering if there is a better solution?

You're on the right path in thinking about defining domain data and separating the definition of the parsers from their use on source data. This is a good approach because, as your real-life project grows, you will probably need more data types.
Here's how I would write it:
/// The resulting type, or DSL
type MyData =
    | IntValue of int
    | FloatValue of float
    | Error // special case for all parse failures

// Then, let's define individual parsers:
let ws = spaces // whitespace skipper used below (assumed to be FParsec's built-in `spaces`)

let pMyInt =
    pint32
    |>> IntValue

// this is an alternative version of float parser.
// it ensures that the value has non-zero fractional part.
// caveat: the naive approach would treat values like 42.0 as integer
let pMyFloat =
    pfloat
    >>= (fun x -> if x % 1.0 = 0.0 then fail "Not a float" else preturn (FloatValue x))

let pError =
    // this parser must consume some input,
    // otherwise combined with `many` it would hang in a dead loop
    skipAnyChar
    >>. preturn Error

// Now, the combined parser:
let pCombined =
    [ pMyFloat; pMyInt; pError ] // note, future parsers will be added here;
                                 // mind the order as float supersedes the int,
                                 // and Error must be the last
    |> List.map (fun p -> p .>> ws) // I'm too lazy to add whitespace skipping
                                    // into each individual parser
    |> List.map attempt // each parser is optional
    |> choice // on each iteration, one of the parsers must succeed
    |> many // a loop
Note that the code above is capable of working with any source: strings, streams, or whatever. Your real app may need to work with files, but unit testing can be simplified by using just a string list.
// Now, applying the parser somewhere in the code:
let maybeParseResult =
    match run pCombined myStringData with
    | Success(result, _, _) -> Some result
    | Failure(_, _, _) -> None // or anything that indicates general parse failure
UPD. I have edited the code according to comments. pMyFloat was updated to ensure that the parsed value has non-zero fractional part.
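For comparison, here is a rough sketch of the same "try the stricter parser first" idea without FParsec, assuming each value has already been split out into its own token. The names Classified and classify are made up for illustration. Unlike the fractional-part check above, this one looks at the text itself, so "3.0" stays a float:

```fsharp
open System
open System.Globalization

// Hypothetical result type for this sketch (not part of the answer above).
type Classified =
    | IntToken of int
    | FloatToken of float
    | NotANumber

// Try the strict integer format first; fall back to float only if that fails.
let classify (token: string) =
    let inv = CultureInfo.InvariantCulture
    match Int32.TryParse(token, NumberStyles.Integer, inv) with
    | true, i -> IntToken i
    | _ ->
        match Double.TryParse(token, NumberStyles.Float, inv) with
        | true, f -> FloatToken f
        | _ -> NotANumber

printfn "%A" (List.map classify [ "3"; "3.0"; "3.5"; "abc" ])
```

NumberStyles.Integer does not accept a decimal point, which is what makes the integer attempt fail on "3.0".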

FParsec has the numberLiteral parser that can be used to solve the problem.
As a start you can use the example available at the link above:
open FParsec
open FParsec.Primitives
open FParsec.CharParsers

type Number = Int of int64
            | Float of float

// -?[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?
let numberFormat = NumberLiteralOptions.AllowMinusSign
                   ||| NumberLiteralOptions.AllowFraction
                   ||| NumberLiteralOptions.AllowExponent

let pnumber : Parser<Number, unit> =
    numberLiteral numberFormat "number"
    |>> fun nl ->
            if nl.IsInteger then Int (int64 nl.String)
            else Float (float nl.String)

Related

F# check if a string contains only number

I am trying to figure out a nice way to check if a string contains only number. This is the result of my effort but it seems really verbose:
let isDigit c = Char.IsDigit c
let rec strContainsOnlyNumber (s:string)=
    let charList = List.ofSeq s
    match charList with
    | x :: xs ->
        if isDigit x then
            strContainsOnlyNumber ( String.Concat (Array.ofList xs))
        else
            false
    | [] -> true
For example, it seems really ugly that I have to convert a string to a char list and then back to a string.
Can you figure out a better solution?
There are a few different options for approaching this.
Given that System.String is a sequence of characters, which you're currently using to turn into a list, you can skip the list conversions and just use Seq.forall to directly test:
let strContainsOnlyNumber (s:string) = s |> Seq.forall Char.IsDigit
If you want to see if it's a valid number, you can parse it into a number directly:
let strContainsOnlyNumber (s:string) = System.Int32.TryParse s |> fst
Note that this will also return true for things like "-342" (which contains -, but is a valid number).
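A quick side-by-side of the two checks so far may help when choosing between them. This is only a sketch; onlyDigits, onlyDigitsNonEmpty and parsesAsInt are names invented here. The main surprise is that Seq.forall is true for the empty string, because there is no character that fails the test:

```fsharp
open System

// Same idea as the Seq.forall version above.
let onlyDigits (s: string) = s |> Seq.forall Char.IsDigit

// Variant that also rejects the empty string.
let onlyDigitsNonEmpty (s: string) = s <> "" && onlyDigits s

// Same idea as the TryParse version above.
let parsesAsInt (s: string) = Int32.TryParse s |> fst

for s in [ "123"; "-342"; "" ] do
    printfn "%A: onlyDigits=%b onlyDigitsNonEmpty=%b parsesAsInt=%b"
        s (onlyDigits s) (onlyDigitsNonEmpty s) (parsesAsInt s)
```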
Another approach would be to use a regular expression:
let numberCheck = System.Text.RegularExpressions.Regex("^[0-9]+$")
let strContainsOnlyNumbers (s:string) = numberCheck.IsMatch s
This pattern matches only the ASCII digits 0-9, but could be adapted to include other symbols in numbers if needed.
If the goal is to later use the string as a number, my suggestion would be to just do a conversion, and store in an option:
let tryToInt (s: string) =
    match System.Int32.TryParse s with
    | true, v -> Some v
    | false, _ -> None
This will allow you to check to see if the value was a number (via Option.isSome), pattern match to use the results, and more.
Note that conversion to floating point numbers is nearly identical: just change Int32.TryParse to Double.TryParse if you want to handle float values.
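A minimal sketch of that floating point variant, assuming you want culture-independent parsing (the explicit NumberStyles.Float and InvariantCulture arguments are my choice, and tryToFloat is a made-up name):

```fsharp
open System
open System.Globalization

// Parse a string as a float, returning None instead of throwing on bad input.
let tryToFloat (s: string) =
    match Double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, v -> Some v
    | _ -> None

printfn "%A" (List.map tryToFloat [ "3"; "3.5"; "1e3"; "abc" ])
```

Without the culture argument, the current thread culture decides whether "." or "," is the decimal separator, which can make the same input parse differently on different machines.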

How to add a condition that a parsed number must satisfy in FParsec?

I am trying to parse an int32 with FParsec but have an additional restriction that the number must be less than some maximum value. Is there a way to do this without writing my own custom parser (as below), and/or is my custom parser (below) the appropriate way of achieving the requirements?
I ask because most of the built-in library functions seem to revolve around a char satisfying certain predicates and not any other type.
let pRow: Parser<int> =
    let error = messageError ("int parsed larger than maxRows")
    let mutable res = Reply(Error, error)
    fun stream ->
        let reply = pint32 stream
        if reply.Status = Ok && reply.Result <= 1000000 then
            res <- reply
        res
UPDATE
Below is an attempt at a more fitting FParsec solution based on the direction given in the comment below:
let pRow2: Parser<int> =
    pint32 >>= (fun x -> if x <= 1048576 then (preturn x) else fail "int parsed larger than maxRows")
Is this the correct way to do it?
You've done excellent research and almost answered your own question.
Generally, there are two approaches:
Unconditionally parse out an int and let the downstream code check it for validity;
Use a guard rule bound to the parser. In this case, (>>=) is the right tool.
To make a good choice, ask yourself whether an integer that fails the guard rule should get another chance by triggering another parser.
Here's what I mean. Usually, in real-life projects, parsers are combined in some chains. If one parser fails, the following one is attempted. For example, in this question, some programming language is parsed, so it needs something like:
let pContent =
    pLineComment <|> pOperator <|> pNumeral <|> pKeyword <|> pIdentifier
Theoretically, your DSL may need to differentiate a "small int value" from another type:
/// The resulting type, or DSL
type Output =
    | SmallValue of int
    | LargeValueAndString of int * string
    | Comment of string

let ws = spaces
// stand-in for the original's unfinished `manyTill ws`:
// one run of non-whitespace characters
let pWord = many1Satisfy (fun c -> not (System.Char.IsWhiteSpace c))

let pSmallValue =
    pint32 >>= (fun x -> if x <= 1048576 then (preturn x) else fail "int parsed larger than maxRows")
    |>> SmallValue

let pLargeValueAndString =
    pint32 .>> ws .>>. pWord
    |>> LargeValueAndString

let pComment =
    pWord
    |>> Comment

let pCombined =
    [ pSmallValue; pLargeValueAndString; pComment ]
    |> List.map (fun p -> p .>> ws) // skip whitespace after each item
    |> List.map attempt // each parser is optional
    |> choice // on each iteration, one of the parsers must succeed
    |> many // a loop
Built this way, pCombined will return:
"42 ABC" gets parsed as [ SmallValue 42 ; Comment "ABC" ]
"1234567 ABC" gets parsed as [ LargeValueAndString(1234567, "ABC") ]
As we see, the guard rule impacts how the parsers are applied, so the guard rule has to be within the parsing process.
If, however, you don't need such complication (e.g., an int is parsed unconditionally), your first snippet is just fine.
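To see why the guard has to sit inside the parser for the fallback to kick in, here is a rough sketch with a hand-rolled mini combinator instead of FParsec. Everything in it (MiniParser, bind, orElse, pInt, the Token type) is invented for illustration; FParsec's own >>= and <|> carry error messages and have different backtracking rules:

```fsharp
open System

// Hypothetical output type for this sketch.
type Token =
    | SmallValue of int
    | LargeValue of int

// A parser takes the remaining input and returns a value plus the rest, or None.
type MiniParser<'a> = string -> ('a * string) option

// Parse a run of leading digits as an int.
let pInt : MiniParser<int> =
    fun input ->
        let n = input |> Seq.takeWhile Char.IsDigit |> Seq.length
        match Int32.TryParse(input.Substring(0, n)) with
        | true, v -> Some (v, input.Substring n)
        | _ -> None

// Monadic bind: run p, then run the parser that f builds from p's result.
let bind (p: MiniParser<'a>) (f: 'a -> MiniParser<'b>) : MiniParser<'b> =
    fun input ->
        match p input with
        | Some (v, rest) -> f v rest
        | None -> None

// Try p; if it fails, try q on the same, untouched input.
let orElse (p: MiniParser<'a>) (q: MiniParser<'a>) : MiniParser<'a> =
    fun input ->
        match p input with
        | Some r -> Some r
        | None -> q input

let maxRows = 1048576

// The guard lives inside the parser, so a too-large int makes pSmall fail.
let pSmall : MiniParser<Token> =
    bind pInt (fun x -> fun rest ->
        if x <= maxRows then Some (SmallValue x, rest) else None)

let pLarge : MiniParser<Token> =
    bind pInt (fun x -> fun rest -> Some (LargeValue x, rest))

let pToken = orElse pSmall pLarge

printfn "%A" (pToken "42 ABC")
printfn "%A" (pToken "1234567 ABC")
```

If the range check were done after parsing instead, pSmall would always succeed on any int and pLarge would never get its turn.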

Monadic parse with uu-parsinglib

I'm trying to create a monadic parser using uu-parsinglib. I thought I had it covered, but I'm getting some unexpected results in testing.
A cut down example of my parser is:
pType :: Parser ASTType
pType = addLength 0 $
    do (Amb n_list) <- pName
       let r_list = filter nameFilter n_list
       case r_list of
         (ASTName_IDName a : [] ) -> return (ASTType a)
         (ASTName_TypeName a : [] ) -> return (ASTType a)
         _ -> pFail
  where nameFilter :: ASTName' -> Bool
        nameFilter a =
          case a of
            (ASTName_IDName _) -> True
            (ASTName_TypeName _) -> True
            _ -> False

data ASTType = ASTType ASTName
data ASTName = Amb [ASTName']
data ASTName' =
    ASTName_IDName ASTName
  | ASTName_TypeName ASTName
  | ASTName_OtherName ASTName
  | ASTName_Simple String
pName is an ambiguous parser. What I want the type parser to do is apply a post-filter and return all alternatives that satisfy nameFilter, wrapped as ASTType.
If there are none, it should fail.
(I realise the example I've given will fail if there is more than one valid match in the list, but the example serves its purpose)
Now, this all works as far as I can see. The problem arises when you use it in more complicated grammars, where odd matches seem to occur. I suspect the problem is the addLength 0 part.
What I would like to do is separate out the monadic and applicative parts. Create a monadic parser with the filtering component, and then apply pName using the <**> operator.
Alternatively
I'd settle for a really good explanation of what addLength is doing.
I've put together a fudge/workaround to use for monadic parsing with uu-parsinglib. The only way I ever use monadic parsers is to analyse the results of an overly generous initial parser and selectively fail them.
bind' :: Parser a -> (a -> Parser b) -> Parser b
bind' a@(P _ _ _ l') b = let (P t nep e _) = (a >>= b) in P t nep e l'
The important thing to remember when using this parser is that
a -> M b
must consume no input. It must either return a transformed version of a, or fail.
WARNING
Testing on this is only minimal currently, and its behaviour is not enforced by type. It is a fudge.

F# Pattern-matching by type

How does pattern matching by type of argument work in F#?
For example, I'm trying to write a simple program which would calculate the square root if a number is provided, or return its argument otherwise.
open System
let my_sqrt x =
    match x with
    | :? float as f -> sqrt f
    | _ -> x
printfn "Enter x"
let x = Console.ReadLine()
printfn "For x = %A result is %A" x (my_sqrt x)
Console.ReadLine()
I get this error:
error FS0008: This runtime coercion or type test from type
'a
to
float
involves an indeterminate type based on information prior
to this program point. Runtime type tests are not allowed
on some types. Further type annotations are needed.
Since sqrt works with float I check for float type, but guess there could be better solution - like check if input is number (in general) and if so, cast it to float?
The problem here is that the type of x is actually string. Because it comes from Console.ReadLine, what kind of information is stored in that string can only be determined at runtime. This means that you can use neither plain pattern matching nor pattern matching with coercion here.
But you can use Active Patterns. Since the actual data stored in x is only known at runtime, you have to parse the string and see what it contains.
So suppose you are expecting a float, but you can't be sure since the user can input whatever they want. We are going to try to parse our string:
let my_sqrt (x: string) =
    let success, v = System.Single.TryParse x // System.Single is float32 in F#; F# float is System.Double
    if success then sqrt v
    else x
But this won't compile:
This expression was expected to have type float32 but here has type string
The problem is that the compiler inferred the function to return a float32, based on the expression sqrt v (where v is a float32). But if x doesn't parse as a float, we intend to just return it, and since x is a string we have an inconsistency here.
To fix this, we will have to convert the result of sqrt to a string:
let my_sqrt (x: string) =
    let success, v = System.Single.TryParse x
    if success then (sqrt v).ToString()
    else x
Ok, this should work, but it doesn't use pattern matching. So let's define our "active" pattern, since we can't use regular pattern matching here:
let (|Float|_|) (input: string) =
    match System.Single.TryParse input with
    | true, v -> Some v
    | _ -> None
Basically, this pattern will match only if the input can be correctly parsed as a floating point literal. Here's how it can be used in your initial function implementation:
let my_sqrt' x =
    match x with
    | Float f -> (sqrt f).ToString()
    | _ -> x
This looks a lot like your function, but note that I still had to add the .ToString() bit.
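If you'd rather not squeeze both outcomes into a string, one alternative is to return an option. The sketch below is only one way to do it and makes a few choices of its own: it uses System.Double (F#'s float) instead of Single, parses with the invariant culture, and treats negative input as a failure. trySqrt is a hypothetical name:

```fsharp
open System
open System.Globalization

// Partial active pattern: matches only when the string parses as a float.
let (|Float|_|) (input: string) =
    match Double.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, v -> Some v
    | _ -> None

// Returns Some root for non-negative numeric input, None otherwise.
let trySqrt (input: string) : float option =
    match input with
    | Float f when f >= 0.0 -> Some (sqrt f)
    | _ -> None

printfn "%A" (List.map trySqrt [ "9"; "-4"; "abc" ])
```

The caller can then decide how to display a failure, instead of the function echoing the raw input back.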
Hope this helps.
Just quoting the one and only Scott Wlaschin's 'F# for fun and profit' site:
Matching on subtypes
You can match on subtypes, using the :? operator,
which gives you a crude polymorphism:
let x = new Object()
let y =
    match x with
    | :? System.Int32 ->
        printfn "matched an int"
    | :? System.DateTime ->
        printfn "matched a datetime"
    | _ ->
        printfn "another type"
This only works to find subclasses of a parent class (in this case,
Object). The overall type of the expression has the parent class as
input.
Note that in some cases, you may need to “box” the value.
let detectType v =
    match v with
    | :? int -> printfn "this is an int"
    | _ -> printfn "something else"
// error FS0008: This runtime coercion or type test from type 'a to int
// involves an indeterminate type based on information prior to this program point.
// Runtime type tests are not allowed on some types. Further type annotations are needed.
The message tells you the problem: “runtime type tests are not allowed
on some types”. The answer is to “box” the value which forces it into
a reference type, and then you can type check it:
let detectTypeBoxed v =
    match box v with // used "box v"
    | :? int -> printfn "this is an int"
    | _ -> printfn "something else"

//test
detectTypeBoxed 1
detectTypeBoxed 3.14
In my opinion, matching and dispatching on types is a code smell, just
as it is in object-oriented programming. It is occasionally necessary,
but used carelessly is an indication of poor design.
In a good object oriented design, the correct approach would be to use
polymorphism to replace the subtype tests, along with techniques such
as double dispatch. So if you are doing this kind of OO in F#, you
should probably use those same techniques.
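Applied to the square-root question at the top of this thread, the boxing trick might look like the sketch below. This is not from the quoted article, and sqrtIfNumber is a hypothetical name. Note that the result has to be obj, since the two branches would otherwise have different types:

```fsharp
// Boxes the argument, then dispatches on its runtime type.
let sqrtIfNumber (v: 'a) : obj =
    match box v with
    | :? float as f -> box (sqrt f)
    | :? int as i -> box (sqrt (float i))
    | other -> other // anything else is returned unchanged

printfn "%A" (sqrtIfNumber 9.0)
printfn "%A" (sqrtIfNumber 16)
printfn "%A" (sqrtIfNumber "hello")
```

This compiles, but a value read from Console.ReadLine is always a string at runtime, so it would still take the last branch; parsing the string remains the actual fix for that program.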

How do I use Some/None Options in this F# example?

I am new to F# and I have this code:
if s.Contains("-") then
    let x,y =
        match s.Split [|'-'|] with
        | [|a;b|] -> int a, int b
        | _ -> 0,0
Notice that we validate that there is a '-' in the string before we split the string, so the match is really unnecessary. Can I rewrite this with Options?
I changed this code, it was originally this (but I was getting a warning):
if s.Contains("-") then
    let [|a;b|] = s.Split [|'-'|]
    let x,y = int a, int b
NOTE: I am splitting a range of numbers (range is expressed in a string) and then creating the integer values that represent the range's minimum and maximum.
The match is not unnecessary, the string might be "1-2-3" and you'll get a three-element array.
Quit trying to get rid of the match, it is your friend, not your enemy. :) Your enemy is the mistaken attempt at pre-validation (the "if contains" logic, which was wrong).
I think you may enjoy this two-part blog series.
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!180.entry
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!181.entry
EDIT
Regarding Some/None comment, yes, you can do
let parseRange (s:string) =
    match s.Split [|'-'|] with
    | [|a;b|] -> Some(int a, int b)
    | _ -> None

let Example s =
    match parseRange s with
    | Some(lo,hi) -> printfn "%d - %d" lo hi
    | None -> printfn "range was bad"

Example "1-2"
Example "1-2-3"
Example "1"
where the return value of parseRange is a Some (success) or None (failure), and the rest of the program can make a decision later based on that.
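One thing the version above still does is throw if either half is not a number, because int "x" raises a FormatException. A possible variant that folds that case into None as well (tryInt and parseRangeSafe are names made up for this sketch):

```fsharp
open System

// Parse an int without throwing.
let tryInt (s: string) =
    match Int32.TryParse s with
    | true, v -> Some v
    | _ -> None

// Some (lo, hi) only when there are exactly two parts and both are ints.
let parseRangeSafe (s: string) =
    match s.Split [|'-'|] with
    | [| a; b |] ->
        match tryInt a, tryInt b with
        | Some lo, Some hi -> Some (lo, hi)
        | _ -> None
    | _ -> None

printfn "%A" (List.map parseRangeSafe [ "1-2"; "1-2-3"; "1-x"; "1" ])
```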
