I'm trying to match an integer expression against character literals, and the compiler complains about type mismatch.
let rec read file includepath =
    let ch = ref 0
    let token = ref 0
    use stream = File.OpenText file
    let readch() =
        ch := stream.Read()
    let lex() =
        match !ch with
        | '!' ->
            readch()
        | _ -> token := !ch
ch has to be an int because that's what stream.Read returns in order to use -1 as end of file marker. If I replace '!' with int '!' it still doesn't work. What's the best way to do this?
open System.IO
let rec read file includepath =
    let ch = ref '0'
    let token = ref '0'
    use stream = File.OpenText file
    let readch() =
        // 'val' is a reserved word in F#, so the local is named v
        let v = stream.Read()
        if v = -1 then xxx // xxx: placeholder for your end-of-file handling
        else
            ch := char v
            xxx
    let lex() =
        match !ch with
        | '!' ->
            readch()
        | _ -> token := !ch
    0
better style:
let rec read file includepath =
    use stream = File.OpenText file
    let getch() =
        let ch = stream.Read()
        if ch = -1 then None
        else Some(char ch)
    let rec getToken() =
        match getch() with
        | Some ch ->
            if ch = '!' then getToken()
            else ch
        | None ->
            failwith "no more chars" // (use your own exception)
The F# language does not have implicit conversions between types, as they break compositionality (i.e. if you move an operation, its meaning changes because the implicit conversion no longer applies). You can use the char operator to convert the int returned by the stream to a char:
open System.IO
let rec read file includepath =
    let ch = ref 0
    let token = ref 0
    use stream = File.OpenText file
    let readch() =
        ch := stream.Read()
    let lex() =
        match char !ch with
        | '!' ->
            readch()
        | _ -> token := !ch
    lex()
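To try the explicit conversion in isolation, here is a minimal sketch that combines the two ideas from the answers above: test for -1 first, then convert with char. The name nextToken is hypothetical, and a StringReader stands in for the file so the snippet runs on its own.

```fsharp
open System.IO

// Hypothetical helper: skips '!' characters and returns the next character, if any.
let nextToken (reader: TextReader) : char option =
    let rec loop () =
        match reader.Read() with
        | -1 -> None                       // Read() signals end of input with -1
        | n when char n = '!' -> loop ()   // explicit int -> char conversion
        | n -> Some (char n)
    loop ()

// A StringReader stands in for File.OpenText here.
let reader = new StringReader("!!a")
let first = nextToken reader    // the two '!' are skipped
let second = nextToken reader   // the input is exhausted
```

Returning an option keeps the -1 sentinel out of the rest of the lexer, so the match on character literals never has to deal with an int.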
Related
I'm translating some OCaml code to F# and have a problem with OCaml's let ... and ..., which exists in F# only for recursive functions (let rec ... and ...).
I have the given OCaml code:
let matches s = let chars = explode s in fun c -> mem c chars
let space = matches " \t\n\r"
and punctuation = matches "() [] {},"
and symbolic = matches "~'!@#$%^&*-+=|\\:;<>.?/"
and numeric = matches "0123456789"
and alphanumeric = matches "abcdefghijklmopqrstuvwxyz_'ABCDEFGHIJKLMNOPQRSTUVWXYZ"
which I want to use in these two methods:
let rec lexwhile prop inp = match inp with
    c::cs when prop c -> let tok,rest = lexwhile prop cs in c+tok,rest
    |_->"",inp
let rec lex inp =
    match snd(lexwhile space inp)with
    []->[]
    |c::cs ->let prop = if alphanumeric(c) then alphanumeric
                        else if symbolic(c) then symbolic
                        else fun c ->false in
             let toktl,rest = lexwhile prop cs in
             (c+toktl)::lex rest
Does anyone have an idea how I should change it so that I can use it?
It looks like you are trying to translate "Handbook of Practical Logic and Automated Reasoning".
Did you see: An F# version of the book code is now available! Thanks to Eric Taucher, Jack Pappas and Anh-Dung Phan.
You need to look at intro.fs
// pg. 17
// ------------------------------------------------------------------------- //
// Lexical analysis.                                                         //
// ------------------------------------------------------------------------- //
let matches s =
    let chars =
        explode s
    fun c -> mem c chars
let space = matches " \t\n\r"
let punctuation = matches "()[]{},"
let symbolic = matches "~`!@#$%^&*-+=|\\:;<>.?/"
let numeric = matches "0123456789"
let alphanumeric = matches "abcdefghijklmnopqrstuvwxyz_'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
let rec lexwhile prop inp =
    match inp with
    | c :: cs when prop c ->
        let tok, rest = lexwhile prop cs
        c + tok, rest
    | _ -> "", inp
let rec lex inp =
    match snd <| lexwhile space inp with
    | [] -> []
    | c :: cs ->
        let prop =
            if alphanumeric c then alphanumeric
            else if symbolic c then symbolic
            else fun c -> false
        let toktl, rest = lexwhile prop cs
        (c + toktl) :: lex rest
I asked many questions here while working on the translation and prefixed them with Converting OCaml to F#:. If you look in the comments you will see how the three of us came to work on this project.
You can just write:
let explode s = [for c in s -> string c]
let matches str strc =
    let eStr = explode str
    List.contains strc eStr
let space = matches " \t\n\r"
let punctuation = matches "() [] {},"
let symbolic = matches "~'!@#$%^&*-+=|\\:;<>.?/"
let numeric = matches "0123456789"
let alphanumeric = matches "abcdefghijklmnopqrstuvwxyz_'ABCDEFGHIJKLMNOPQRSTUVWXYZ"
F# tends to be written using lightweight, rather than verbose, syntax so you don't generally need to use the in, begin and end keywords. See: https://msdn.microsoft.com/en-us/library/dd233199.aspx for details about the differences.
Personally, I would probably refactor all of those string -> bool functions into active patterns, e.g.:
let (|Alphanumeric|_|) str =
    match matches "abcdefghijklmnopqrstuvwxyz_'ABCDEFGHIJKLMNOPQRSTUVWXYZ" str with
    |true -> Some str
    |false -> None
let (|Symbolic|_|) str =
    match matches "~'!@#$%^&*-+=|\\:;<>.?/" str with
    |true -> Some str
    |false -> None
Then you can pattern match, e.g.:
match c with
|Alphanumeric _ -> // alphanumeric case
|Symbolic _ -> // symbolic case
|_ -> // other cases
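For reference, a self-contained sketch of the active-pattern idea that can be pasted into F# Interactive. The matches helper here is a simplified stand-in (it uses String.Contains rather than explode), and classify is a hypothetical function that exists only to exercise the patterns.

```fsharp
// Simplified stand-in: true when the one-character string c occurs in chars.
let matches (chars: string) (c: string) = c.Length = 1 && chars.Contains c

// Partial active patterns: each returns Some when the character belongs to the class.
let (|Alphanumeric|_|) c =
    if matches "abcdefghijklmnopqrstuvwxyz_'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" c
    then Some c
    else None

let (|Symbolic|_|) c =
    if matches "~`!@#$%^&*-+=|\\:;<>.?/" c then Some c else None

// Hypothetical classifier showing the patterns inside a match expression.
let classify c =
    match c with
    | Alphanumeric _ -> "alphanumeric"
    | Symbolic _ -> "symbolic"
    | _ -> "other"
```

The patterns are tried in order, so a character that belonged to both classes would be reported as alphanumeric.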
I'm working on a program that iterates over an input file, with a variable number of 'programs', and ending in '0'. My function run works fine if I start it from the top of the file, but for some reason a line is consumed by peeking to see if the next char is '0' (indicating the end of the file).
Here's my code:
let line_stream_of_channel channel =
  Stream.from
    (fun _ ->
      try Some (input_line channel) with End_of_file -> None);;
let in_channel = open_in "dull.in" in
let line_stream = line_stream_of_channel in_channel in
while Stream.peek line_stream != Some "0" do
  run in_channel;
  print_string "...\n";
done;;
From what I've read, Stream.peek shouldn't consume a line, so maybe the problem doesn't come from that, but if not, I can't figure out what's doing it. Any ideas?
Edit: Here's the entirety of my program:
let hello c =
print_char c;;
let hello_int c =
print_int c;
print_char '\n';;
let ios = int_of_string;;
let rec print_string_list = function
[] -> print_string "\n"
| h::t -> print_string h ; print_string " " ; print_string_list t;;
let rec print_int_list = function
[] -> print_string "\n"
| h::t -> print_int h ; print_string " " ; print_int_list t;;
let rec append l i =
match l with
[] -> [i]
| h :: t -> h :: (append t i);;
let line_stream_of_channel channel =
Stream.from
(fun _ ->
try Some (input_line channel) with End_of_file -> None);;
let string_to_int_list str_list int_list=
let len = List.length str_list in
for i = 0 to len - 1 do
int_list := append !int_list (ios (List.nth str_list i));
done;;
let get_option = function
| Some x -> x
| None -> raise (Invalid_argument "Option.get");;
let chomp_line ns in_channel =
let s = input_line in_channel in
let len = String.length s in
let start_pos = ref 0 in
for i = 0 to len do
if i == len then
let word = String.sub s !start_pos (i - !start_pos) in
ns := append !ns word;
else if s.[i] == ' ' then
let word = String.sub s !start_pos (i - !start_pos) in
ns := append !ns word;
start_pos := i + 1;
done;;
let run in_channel =
let ns = ref [] in
chomp_line ns in_channel;
let n = ios (List.nth !ns 0) in
let p = ios (List.nth !ns 1) in
let s = ios (List.nth !ns 2) in
print_string "num dulls: "; hello_int n;
print_string "num programs: "; hello_int p;
print_string "num state transitions: "; hello_int s;
let dull_sizes = ref [] in
chomp_line dull_sizes in_channel;
let int_dull_sizes = ref [] in
string_to_int_list !dull_sizes int_dull_sizes;
print_string "size of dulls: "; print_int_list !int_dull_sizes;
let program_sizes = ref [] in
let program_dulls = ref [] in
for i = 0 to p - 1 do
let program = ref [] in
chomp_line program in_channel;
program_sizes := append !program_sizes (List.nth !program 0);
program_dulls := append !program_dulls (List.nth !program 1);
done;
let int_program_sizes = ref [] in
string_to_int_list !program_sizes int_program_sizes;
print_string "program sizes: "; print_int_list !int_program_sizes;
print_string "program dulls: "; print_string_list !program_dulls;
let transitions = ref [] in
chomp_line transitions in_channel;
let int_transitions = ref [] in
string_to_int_list !transitions int_transitions;
for i = 0 to s - 1 do
hello_int (List.nth !int_transitions i)
done
;;
let in_channel = open_in "dull.in" in
let line_stream = line_stream_of_channel in_channel in
while Stream.peek line_stream <> Some "0" do
run in_channel;
done;;
And here's a sample input:
2 2 3
500 600
100 A
200 B
2 1 2
5 4 8
100 400 200 500 300
250 AC
360 ACE
120 AB
40 DE
2 3 4 -3 1 2 -2 1
0
(!=) is physical (pointer) inequality, and the test fails to detect your end mark "0". When "0" is peeked, Stream.peek returns Some "0", but that value is a different entity from the Some "0" on the right-hand side of the inequality check, so the loop never terminates until it crashes at EOF.
The following demonstrates what is happening:
# Some 0 != Some 0;;
- : bool = true
# let x = Some 0 in x != x;;
- : bool = false
Use (<>), structural inequality, here. Apart from that and the omitted run in_channel part, the code works fine for me.
A golden rule: do not use physical equality (==) and (!=) unless you really need them. Normally, stick to structural equalities (=) and (<>).
-- edit --
There was another issue in the code which was not originally revealed.
Once you create a stream from an in_channel, do not touch the channel yourself until you want to close it with close_in! Let the stream be its only reader.
The benefit of the stream is that once created, you are freed from taking care of when the actual readings happen. You could still access the channel directly, but it just ruins the benefit completely. Just do not do it. Use Stream.next or Stream.peek instead of input_line in your run.
Is there a way to use F#'s sprintf float formatting with a decimal comma? It would be nice if this worked:
sprintf "%,1f" 23.456
// expected: "23,456"
Or can I only use String.Format Method (IFormatProvider, String, Object()) ?
EDIT: I would like to have a comma, not a point, as the decimal separator, as is used in most non-English-speaking countries.
It's quite a pain, but you can write your own version of sprintf that does exactly what you want:
open System
open System.Text.RegularExpressions
open System.Linq.Expressions
let printfRegex = Regex(@"^(?<text>[^%]*)((?<placeholder>%(%|((0|-|\+| )?([0-9]+)?(\.[0-9]+)?b|c|s|d|i|u|x|X|o|e|E|f|F|g|G|M|O|A|\+A|a|t)))(?<text>[^%]*))*$", RegexOptions.ExplicitCapture ||| RegexOptions.Compiled)
type PrintfExpr =
    | K of Expression
    | F of ParameterExpression * Expression
let sprintf' (c:System.Globalization.CultureInfo) (f:Printf.StringFormat<'a>) : 'a =
    //'a has form 't1 -> 't2 -> ... -> string
    let cultureExpr = Expression.Constant(c) :> Expression
    let m = printfRegex.Match(f.Value)
    let prefix = m.Groups.["text"].Captures.[0].Value
    let inputTypes =
        let rec loop t =
            if Reflection.FSharpType.IsFunction t then
                let dom, rng = Reflection.FSharpType.GetFunctionElements t
                dom :: loop rng
            else
                if t <> typeof<string> then
                    failwithf "Unexpected return type: %A" t
                []
        ref(loop typeof<'a>)
    let pop() =
        let (t::ts) = !inputTypes
        inputTypes := ts
        t
    let exprs =
        K(Expression.Constant(prefix)) ::
        [for i in 0 .. m.Groups.["placeholder"].Captures.Count - 1 do
            let ph = m.Groups.["placeholder"].Captures.[i].Value
            let text = m.Groups.["text"].Captures.[i+1].Value
            // TODO: handle flags, width, precision, other placeholder types, etc.
            if ph = "%%" then yield K(Expression.Constant("%" + text))
            else
                match ph with
                | "%f" ->
                    let t = pop()
                    if t <> typeof<float> && t <> typeof<float32> then
                        failwithf "Unexpected type for %%f placeholder: %A" t
                    let e = Expression.Variable t
                    yield F(e, Expression.Call(e, t.GetMethod("ToString", [| typeof<System.Globalization.CultureInfo> |]), [cultureExpr]))
                | "%s" ->
                    let t = pop()
                    if t <> typeof<string> then
                        failwithf "Unexpected type for %%s placeholder: %A" t
                    let e = Expression.Variable t
                    yield F(e, e)
                | _ ->
                    failwithf "unhandled placeholder: %s" ph
                yield K (Expression.Constant text)]
    let innerExpr =
        Expression.Call(typeof<string>.GetMethod("Concat", [|typeof<string[]>|]), Expression.NewArrayInit(typeof<string>, exprs |> Seq.map (fun (K e | F(_,e)) -> e)))
        :> Expression
    let funcConvert =
        typeof<FuncConvert>.GetMethods()
        |> Seq.find (fun mi -> mi.Name = "ToFSharpFunc" && mi.GetParameters().[0].ParameterType.GetGenericTypeDefinition() = typedefof<Converter<_,_>>)
    let body =
        List.foldBack (fun pe (e:Expression) ->
            match pe with
            | K _ -> e
            | F(p,_) ->
                let m = funcConvert.MakeGenericMethod(p.Type, e.Type)
                Expression.Call(m, Expression.Lambda(m.GetParameters().[0].ParameterType, e, p))
                :> Expression) exprs innerExpr
    Expression.Lambda(body, [||]).Compile().DynamicInvoke() :?> 'a
sprintf' (Globalization.CultureInfo.GetCultureInfo "fr-FR") "%s %f > %f" "It worked!" 1.5f -12.3
Taking a look at source code of Printf module, it uses invariantCulture. I don't think printf-like functions are culture aware.
If you always need a comma, you could use sprintf and string.Replace function. If your code is culture-dependent, using ToString or String.Format is your best bet.
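The simpler options mentioned above might look like the following sketch. The culture "de-DE" is only an assumed example of a culture that uses a decimal comma, formatFloat is a hypothetical helper name, and the results depend on the culture data available on the machine.

```fsharp
open System.Globalization

// Assumption: "de-DE" stands in for any culture with a decimal comma.
let german = CultureInfo.GetCultureInfo "de-DE"

// Hypothetical helper: fixed number of decimals, formatted for the given culture.
let formatFloat (culture: CultureInfo) (decimals: int) (x: float) =
    x.ToString("F" + string decimals, culture)

// Culture-aware formatting via ToString and String.Format.
let viaToString = formatFloat german 3 23.456
let viaFormat = System.String.Format(german, "{0:F1}", box 23.456)

// sprintf output followed by a plain character replacement; this is only
// safe when the formatted text contains no other '.' characters.
let viaReplace = (sprintf "%.3f" 23.456).Replace('.', ',')
```

With standard culture data, viaToString and viaReplace should both be "23,456" and viaFormat should be "23,5".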
I've started learning FParsec. It has a very flexible way to parse numbers; I can provide a set of number formats I want to use:
type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int
let numberFormat = NumberLiteralOptions.AllowFraction
                   ||| NumberLiteralOptions.AllowHexadecimal
                   ||| NumberLiteralOptions.AllowBinary
let pnumber =
    numberLiteral numberFormat "number"
    |>> fun num -> if num.IsHexadecimal then Hexadecimal (int num.String)
                   elif num.IsBinary then Binary (int num.String)
                   elif num.IsInteger then Numeral (int num.String)
                   else Decimal (float num.String)
However, the language I'm trying to parse is a bit strange. A number could be numeral (non-negative int), decimal (non-negative float), hexadecimal (with prefix #x) or binary (with prefix #b):
numeral: 0, 2
decimal: 0.2, 2.0
hexadecimal: #xA04, #x611ff
binary: #b100, #b001
Right now I have to do parsing twice by substituting # by 0 (if necessary) to make use of pnumber:
let number: Parser<_, unit> =
    let isDotOrDigit c = isDigit c || c = '.'
    let numOrDec = many1Satisfy2 isDigit isDotOrDigit
    let hexOrBin = skipChar '#' >>. manyChars (letter <|> digit) |>> sprintf "0%s"
    let str = spaces >>. numOrDec <|> hexOrBin
    str |>> fun s -> match run pnumber s with
                     | Success(result, _, _) -> result
                     | Failure(errorMsg, _, _) -> failwith errorMsg
What is a better way of parsing in this case? Or how can I alter FParsec's CharStream to be able to make conditional parsing easier?
Parsing numbers can be pretty messy if you want to generate good error messages and properly check for overflows.
The following is a simple FParsec implementation of your number parser:
let numeralOrDecimal : Parser<_, unit> =
    // note: doesn't parse a float exponent suffix
    numberLiteral NumberLiteralOptions.AllowFraction "number"
    |>> fun num ->
            // raises an exception on overflow
            if num.IsInteger then Numeral(int num.String)
            else Decimal(float num.String)
let hexNumber =
    pstring "#x" >>. many1SatisfyL isHex "hex digit"
    |>> fun hexStr ->
            // raises an exception on overflow
            Hexadecimal(System.Convert.ToInt32(hexStr, 16))
let binaryNumber =
    pstring "#b" >>. many1SatisfyL (fun c -> c = '0' || c = '1') "binary digit"
    |>> fun binStr ->
            // raises an exception on overflow
            Binary(System.Convert.ToInt32(binStr, 2))
let number =
    choiceL [numeralOrDecimal
             hexNumber
             binaryNumber]
            "number literal"
Generating good error messages on overflows would complicate this implementation a bit, as you would ideally also need to backtrack after the error, so that the error position ends up at the start of the number literal (see the numberLiteral docs for an example).
A simple way to gracefully handle possible overflow exceptions is to use a small exception-handling combinator like the following:
let mayThrow (p: Parser<'t,'u>) : Parser<'t,'u> =
    fun stream ->
        let state = stream.State
        try
            p stream
        with e -> // catching all exceptions is somewhat dangerous
            stream.BacktrackTo(state)
            Reply(FatalError, messageError e.Message)
You could then write
let number = mayThrow (choiceL [...] "number literal")
I'm not sure what you meant to say with "alter FParsec's CharStream to be able to make conditional parsing easier", but the following sample demonstrates how you could write a low-level implementation that only uses the CharStream methods directly.
type NumberStyles = System.Globalization.NumberStyles
let invariantCulture = System.Globalization.CultureInfo.InvariantCulture
let number: Parser<Number, unit> =
    let expectedNumber = expected "number"
    let inline isBinary c = c = '0' || c = '1'
    let inline hex2int c = (int c &&& 15) + (int c >>> 6)*9
    let hexStringToInt (str: string) = // does no argument or overflow checking
        let mutable n = 0
        for c in str do
            n <- n*16 + hex2int c
        n
    let binStringToInt (str: string) = // does no argument or overflow checking
        let mutable n = 0
        for c in str do
            n <- n*2 + (int c - int '0')
        n
    let findIndexOfFirstNonNull (str: string) =
        let mutable i = 0
        while i < str.Length && str.[i] = '0' do
            i <- i + 1
        i
    let isHexFun = id isHex // tricks the compiler into caching the function object
    let isDigitFun = id isDigit
    let isBinaryFun = id isBinary
    fun stream ->
        let start = stream.IndexToken
        let cs = stream.Peek2()
        match cs.Char0, cs.Char1 with
        | '#', 'x' ->
            stream.Skip(2)
            let str = stream.ReadCharsOrNewlinesWhile(isHexFun, false)
            if str.Length <> 0 then
                let i = findIndexOfFirstNonNull str
                let length = str.Length - i
                if length < 8 || (length = 8 && str.[i] <= '7') then
                    Reply(Hexadecimal(hexStringToInt str))
                else
                    stream.Seek(start)
                    Reply(Error, messageError "hex number literal is too large for 32-bit int")
            else
                Reply(Error, expected "hex digit")
        | '#', 'b' ->
            stream.Skip(2)
            let str = stream.ReadCharsOrNewlinesWhile(isBinaryFun, false)
            if str.Length <> 0 then
                let i = findIndexOfFirstNonNull str
                let length = str.Length - i
                if length < 32 then
                    Reply(Binary(binStringToInt str))
                else
                    stream.Seek(start)
                    Reply(Error, messageError "binary number literal is too large for 32-bit int")
            else
                Reply(Error, expected "binary digit")
        | c, _ ->
            if not (isDigit c) then Reply(Error, expectedNumber)
            else
                stream.SkipCharsOrNewlinesWhile(isDigitFun) |> ignore
                if stream.Skip('.') then
                    let n2 = stream.SkipCharsOrNewlinesWhile(isDigitFun)
                    if n2 <> 0 then
                        // we don't parse any exponent, as in the other example
                        let mutable result = 0.
                        if System.Double.TryParse(stream.ReadFrom(start),
                                                  NumberStyles.AllowDecimalPoint,
                                                  invariantCulture,
                                                  &result)
                        then Reply(Decimal(result))
                        else
                            stream.Seek(start)
                            Reply(Error, messageError "decimal literal is larger than System.Double.MaxValue")
                    else
                        Reply(Error, expected "digit")
                else
                    let decimalString = stream.ReadFrom(start)
                    let mutable result = 0
                    if System.Int32.TryParse(stream.ReadFrom(start),
                                             NumberStyles.None,
                                             invariantCulture,
                                             &result)
                    then Reply(Numeral(result))
                    else
                        stream.Seek(start)
                        Reply(Error, messageError "decimal number literal is too large for 32-bit int")
While this implementation parses hex and binary numbers without the help of system methods, it eventually delegates the parsing of decimal numbers to the Int32.TryParse and Double.TryParse methods.
As I said: it's messy.
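The hex2int bit trick in the listing above is easy to misread, so here is a small standalone sketch of it that needs no FParsec. It assumes the input contains only valid hex digits, as the original helper does; the fold-based hexStringToInt is just a compact rewrite of the loop.

```fsharp
// The low four bits give 0-9 for '0'-'9' and 1-6 for 'A'-'F' / 'a'-'f'.
// Letters have bit 6 set, so (int c >>> 6) * 9 adds the missing 9 for them.
let hex2int (c: char) = (int c &&& 15) + (int c >>> 6) * 9

// Like the original helper, this does no argument or overflow checking.
let hexStringToInt (str: string) =
    str |> Seq.fold (fun n c -> n * 16 + hex2int c) 0

let digitValues = "09AFaf" |> Seq.map hex2int |> List.ofSeq
// digitValues = [0; 9; 10; 15; 10; 15]
let a04 = hexStringToInt "A04"
// a04 = 2564, i.e. 10*256 + 0*16 + 4
```

Because nothing is validated, a character such as 'g' would silently produce a wrong value, which is why the parser only feeds it characters accepted by isHex.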
Given a list of vowels, I have written the function startsWithVowel to check whether a word starts with a vowel. As you can see, I use an exception for control flow, which is not ideal. How can I implement this better?
let vowel = ['a'; 'e'; 'i'; 'o'; 'u']
let startsWithVowel(str :string) =
    try
        List.findIndex (fun x -> x = str.[0]) vowel
        true
    with
    | :? System.Collections.Generic.KeyNotFoundException -> false
UPDATE: thanks to all. Once again I have learned: never hesitate to ask a newbie question. I see a lot of very useful remarks, keep them coming :-)
Try using the exists function instead:
let vowel = ['a'; 'e'; 'i'; 'o'; 'u']
let startsWithVowel(str :string) = List.exists (fun x -> x = str.[0]) vowel
exists returns true if any element in the list returns true for the predicate and false otherwise.
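One caveat with all of the str.[0] variants is that indexing an empty string raises an exception, so the original exception-as-control-flow problem can come back through another door. A minimal sketch that guards against null or empty input (the guard is an addition, not part of the answer above):

```fsharp
let vowels = ['a'; 'e'; 'i'; 'o'; 'u']

// Returns false for null or empty input instead of raising on str.[0].
let startsWithVowel (str: string) =
    not (System.String.IsNullOrEmpty str)
    && List.exists (fun v -> v = str.[0]) vowels
```

Since && short-circuits, str.[0] is only evaluated once the string is known to be non-empty.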
Use sets for efficient lookup:
let vowels = Set.ofList ['a'; 'e'; 'i'; 'o'; 'u']
let startsWithVowel(str : string) = vowels |> Set.contains (str.[0])
Yet another alternative, tryFindIndex returns Some or None rather than throwing an exception:
> let vowel = ['A'; 'E'; 'I'; 'O'; 'U'; 'a'; 'e'; 'i'; 'o'; 'u']
  let startsWithVowel(str :string) =
      match List.tryFindIndex (fun x -> x = str.[0]) vowel with
      | Some(_) -> true
      | None -> false;;
val vowel : char list = ['A'; 'E'; 'I'; 'O'; 'U'; 'a'; 'e'; 'i'; 'o'; 'u']
val startsWithVowel : string -> bool
> startsWithVowel "Juliet";;
val it : bool = false
> startsWithVowel "Omaha";;
val it : bool = true
I benchmarked a few approaches mentioned in this thread (Edit: added nr. 6).
1. The List.exists approach (~0.75 seconds)
2. The Set.contains approach (~0.51 seconds)
3. String.IndexOf (~0.25 seconds)
4. A non-compiled regex (~5 - 6 seconds)
5. A compiled regex (~1.0 seconds)
6. Pattern matching (why did I forget this the first time?) (~0.17 seconds)
I filled a list with 500000 random words and filtered it through various startsWithVowel functions, repeated 10 times.
Test code:
open System.Text.RegularExpressions
let startsWithVowel1 =
    let vowels = ['a';'e';'i';'o';'u']
    fun (s:string) -> vowels |> List.exists (fun v -> s.[0] = v)
let startsWithVowel2 =
    let vowels = ['a';'e';'i';'o';'u'] |> Set.ofList
    fun (s:string) -> Set.contains s.[0] vowels
let startsWithVowel3 (s:string) = "aeiou".IndexOf(s.[0]) >= 0
let startsWithVowel4 str = Regex.IsMatch(str, "^[aeiou]")
let startsWithVowel5 =
    let rex = new Regex("^[aeiou]",RegexOptions.Compiled)
    fun (s:string) -> rex.IsMatch(s)
let startsWithVowel6 (s:string) =
    match s.[0] with
    | 'a' | 'e' | 'i' | 'o' | 'u' -> true
    | _ -> false
//5x10^5 random words
let gibberish =
    let R = new System.Random()
    let (word:byte[]) = Array.zeroCreate 5
    [for _ in 1..500000 ->
        new string ([|for _ in 3..R.Next(4)+3 -> char (R.Next(26)+97)|])
    ]
//f = startsWithVowelX, use #time in F# interactive for the timing
let test f =
    for _ in 1..10 do
        gibberish |> List.filter f |> ignore
My humble conclusion:
EDIT: F# pattern matching (rather than the imperative IndexOf, as I first concluded) wins the speed contest.
The Set.contains approach wins the beauty contest.
Note also that a number of exception-throwing functions have non-exception equivalents that return option rather than throwing - these typically have a 'try' prefix in the function name.
List.tryFindIndex:
http://msdn.microsoft.com/en-us/library/ee340224(VS.100).aspx
See also
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!181.entry
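As a small illustration of the 'try' naming convention, the sketch below uses List.tryFindIndex in place of List.findIndex. The names indexOfVowel and describe are hypothetical and exist only for the example.

```fsharp
let vowels = ['a'; 'e'; 'i'; 'o'; 'u']

// List.findIndex raises KeyNotFoundException when nothing matches;
// List.tryFindIndex returns None instead.
let indexOfVowel (c: char) = List.tryFindIndex (fun v -> v = c) vowels

// The option result is handled with an ordinary match, no try/with needed.
let describe (c: char) =
    match indexOfVowel c with
    | Some i -> sprintf "vowel number %d" (i + 1)
    | None -> "not a vowel"
```

The same pairing exists for other lookups, for example List.find / List.tryFind and Map.find / Map.tryFind.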
Using regular expressions:
open System.Text.RegularExpressions
let startsWithVowel str = Regex.IsMatch(str, "^[AEIOU]", RegexOptions.IgnoreCase)
let startsWithVowel (word:string) =
    let vowels = ['a';'e';'i';'o';'u']
    List.exists (fun v -> v = word.[0]) vowels