So I am trying to use F# to find out whether a string has matching parentheses.
i.e. (abc) returns true, ((hello) returns false, )( returns false, etc.
What I (think I) am doing is using a stack: push when I see a '(' and pop when I see a ')' in the list. When the list of characters is empty, the stack either still has an item or it doesn't; if it does, the string is invalid. If I come across a ')' while the stack is empty, it is also invalid. Otherwise it is a valid string.
// Break string
let break_string (str:string) =
    Seq.toList str

let isBalanced (str:string) =
    let lst = break_string str
    let stack = []
    let rec balance str_lst (stk:'a list)=
        match str_lst with
        | [] ->
            if stk.Length > 0 then
                false
            else
                true
        | x::xs ->
            if x = '(' then
                balance (xs x::stack)
            elif x = ')' then
                if stack.Length = 0 then
                    false
                else
                    stack = stack.tail
    balance (lst, stack)
I am pretty new to F#, so I think this might be doing what I want; however, I get the error message:
"This expression was expected to have type bool but here has type 'a list -> bool"
First, what does that actually mean?
Second, since it is returning a bool, why doesn't that work?
The type-checker thinks you forgot the second argument to balance. When you write this:
balance (xs x::stack)
It means "apply balance to (apply xs to x::stack)". You probably meant this:
balance xs (x::stack)
Your final elif also seems to be missing an else branch, and the line stack = stack.tail looks like you're trying to assign to stack. You can't do that since stack is an immutable variable. Likely you want balance xs (List.tail stack). The principle is that instead of assigning to variables, you call your function with new values.
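Putting those fixes together, a minimal sketch of the original function could look like this. It keeps the asker's structure; the final else branch, which skips any character that is not a parenthesis, is an assumption about the intended behaviour.

```fsharp
// Sketch: the question's function with only the fixes described above applied.
let isBalanced (str: string) =
    let rec balance chars stack =
        match chars with
        | [] -> List.isEmpty stack                 // a leftover '(' means unbalanced
        | x :: xs ->
            if x = '(' then
                balance xs (x :: stack)            // two separate arguments, not one
            elif x = ')' then
                if List.isEmpty stack then false
                else balance xs (List.tail stack)  // "pop" by passing the tail along
            else
                balance xs stack                   // assumed: ignore other characters
    balance (Seq.toList str) []

printfn "%b" (isBalanced "(abc)")    // true
printfn "%b" (isBalanced "((hello)") // false
printfn "%b" (isBalanced ")(")       // false
```

Nothing is ever assigned: every "change" to the stack is just a new value handed to the next recursive call.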
You have the right idea, of course. It can be much more concise by folding all the matching (what's the letter? what's on the stack?) into one single match statement. The trick is to put all the things you want to look at into a single tuple:
let isBalanced str =
    let rec loop xs stack =
        match (xs, stack) with
        | '(' :: ys, stack -> loop ys ('(' :: stack)
        | ')' :: ys, '(' :: stack -> loop ys stack
        | ')' :: _, _ -> false
        | _ :: ys, stack -> loop ys stack
        | [], [] -> true
        | [], _ -> false
    loop (Seq.toList str) []
This is my first time trying Haskell. I'm trying to make a function that takes an element and a list and removes the second appearance of the item. For example, if the element is 2 and the list is [2,3,4,2,5,2] the result would be [2, 3, 4, 5, 2].
However I am getting this error:
TareaHaskell.hs:36:69: error: parse error on input ‘)’
|
36 | | ( (a == x) && not (isItIn x newList) ) = ( (let newList = x:[]) && (deleteSecond a xs) )
Code:
isItIn :: (Eq a ) => a -> [a] -> Bool
isItIn a [] = False
isItIn a (x:xs) = if a == x
                  then True
                  else isItIn a xs

deleteSecond :: (Eq a ) => a -> ( [a] -> [a] )
deleteSecond a [] = newList
deleteSecond a (x:xs)
    | ( (a == x) && not (isItIn x newList) ) = ( (let newList = x:[]) && (deleteSecond a xs) )
    | (a == x) && (isItIn x newList) = (deleteSecond a xs)
    | otherwise = let newList = x:[] && deleteSecond a xs
I read that it might be a problem with the indentation; however, I already tried using spaces and moving it back and forth, and it still isn't working.
I am also using Notepad++ and Sublime to help with the indentation, but nothing helps.
The problem is that the parser isn't expecting the ) in this code:
(let newList = x:[])
because it's invalid in Haskell to have a let statement that isn't followed by in. (With the exception of inside a do block.)
It's really not clear to me what your actual intention is, but all let is for is to give a more complex expression a temporary name inside a block of code. A let statement without an accompanying in doesn't make any sense, and is causing your parse error here.
Your code is a little overcomplicated. If you simplified it, it would be clearer where the problem is, but
let newList = x:[]
is only valid if the next symbol is in. The parser is complaining that you're trying to parenthesize it for no reason. A let..in expression is just a way of temporarily binding a name to a value for the purpose of the expression that follows in.
let var = value in expr
As far as the problem itself: this is pretty easy to solve with explicit recursion.
removeSecond :: Eq a => a -> [a] -> [a]
removeSecond = go False
  where go _ _ [] = []
        -- The Bool records whether the first occurrence has already been seen.
        go True needle (x:xs) | needle == x = xs
                              | otherwise = x : go True needle xs
        go False needle (x:xs) | needle == x = x : go True needle xs
                               | otherwise = x : go False needle xs
sexp is like this: type sexp = Atom of string | List of sexp list, e.g., "((a b) ((c d) e) f)".
I have written a parser to parse a sexp string to the type:
let of_string s =
  let len = String.length s in
  let empty_buf () = Buffer.create 16 in
  let rec parse_atom buf i =
    if i >= len then failwith "cannot parse"
    else
      match s.[i] with
      | '(' -> failwith "cannot parse"
      | ')' -> Atom (Buffer.contents buf), i-1
      | ' ' -> Atom (Buffer.contents buf), i
      | c when i = len-1 -> (Buffer.add_char buf c; Atom (Buffer.contents buf), i)
      | c -> (Buffer.add_char buf c; parse_atom buf (i+1))
  and parse_list acc i =
    if i >= len || (i = len-1 && s.[i] <> ')') then failwith "cannot parse"
    else
      match s.[i] with
      | ')' -> List (List.rev acc), i
      | '(' ->
        let list, j = parse_list [] (i+1) in
        parse_list (list::acc) (j+1)
      | c ->
        let atom, j = parse_atom (empty_buf()) i in
        parse_list (atom::acc) (j+1)
  in
  if s.[0] <> '(' then
    let atom, j = parse_atom (empty_buf()) 0 in
    if j = len-1 then atom
    else failwith "cannot parse"
  else
    let list, j = parse_list [] 1 in
    if j = len-1 then list
    else failwith "cannot parse"
But I think it is too verbose and ugly.
Can someone help me with an elegant way to write such a parser?
Actually, I always have problems writing parser code, and all I manage to produce is an ugly one like this.
Any tricks for this kind of parsing? How to effectively deal with symbols, such as (, ), that implies recursive parsing?
You can use a lexer+parser discipline to separate the details of lexical syntax (skipping spaces, mostly) from the actual grammar structure. That may seem overkill for such a simple grammar, but it's actually better as soon as the data you parse has the slightest chance of being wrong: you really want error location (and not to implement them yourself).
A technique that is easy and gives short parsers is to use stream parsers (using a Camlp4 extension for them, described in the book Developing Applications with Objective Caml); you may even get a lexer for free by using the Genlex module.
If you really want to do it manually, as in your example above, here is my recommendation for a nice parser structure. Have mutually recursive parsers, one for each category of your syntax, with the following interface:
parsers take as input the index at which to start parsing
they return a pair of the parsed value and the first index not part of the value
nothing more
Your code does not respect this structure. For example, your parser for atoms will fail if it sees a (. That is not its role or responsibility: it should simply consider that this character is not part of the atom, and return the atom parsed so far, indicating that this position is not in the atom anymore.
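As a small illustration of that interface for atoms only, here is a sketch written in F# (which reads close to OCaml at this level). The name parseAtom and the choice of delimiters (space and both parentheses) are assumptions made for the example, not part of the answer's parser.

```fsharp
// Hypothetical atom parser following the interface above: it takes the index
// at which to start and returns (value, first index that is not part of it).
// It never fails on '(' or ')': it simply stops in front of them.
let parseAtom (s: string) (start: int) : string * int =
    let rec scan i =
        if i < s.Length && s.[i] <> ' ' && s.[i] <> '(' && s.[i] <> ')' then
            scan (i + 1)
        else
            i
    let stop = scan start
    s.Substring(start, stop - start), stop

printfn "%A" (parseAtom "(ab cd)" 1)   // ("ab", 3)
```

The caller decides what the character at the returned index means; the atom parser only reports where the atom ended.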
Here is a full code example in this style for your grammar. I have split the parsers with accumulators into triples (start_foo, parse_foo and finish_foo) to factor out multiple start or return points, but that is only an implementation detail.
I have used a new feature of 4.02 just for fun, match with exception, instead of explicitly testing for the end of the string. It is of course trivial to revert to something less fancy.
Finally, the current parser does not fail if the valid expression ends before the end of the input, it only returns the end of the input on the side. That's helpful for testing but you would do it differently in "production", whatever that means.
let of_string str =
  let rec parse i =
    match str.[i] with
    | exception _ -> failwith "unfinished input"
    | ')' -> failwith "extraneous ')'"
    | ' ' -> parse (i+1)
    | '(' -> start_list (i+1)
    | _ -> start_atom i
  and start_list i = parse_list [] i
  and parse_list acc i =
    match str.[i] with
    | exception _ -> failwith "unfinished list"
    | ')' -> finish_list acc (i+1)
    | ' ' -> parse_list acc (i+1)
    | _ ->
      let elem, j = parse i in
      parse_list (elem :: acc) j
  and finish_list acc i =
    List (List.rev acc), i
  and start_atom i = parse_atom (Buffer.create 3) i
  and parse_atom acc i =
    match str.[i] with
    | exception _ -> finish_atom acc i
    | ')' | ' ' -> finish_atom acc i
    | _ -> parse_atom (Buffer.add_char acc str.[i]; acc) (i + 1)
  and finish_atom acc i =
    Atom (Buffer.contents acc), i
  in
  let result, rest = parse 0 in
  result, String.sub str rest (String.length str - rest)
Note that it is an error to reach the end of input when parsing a valid expression (you must have read at least one atom or list) or when parsing a list (you must have encountered the closing parenthesis), yet it is valid at the end of an atom.
This parser does not return location information. All real-world parsers should do so, and this is enough of a reason to use a lexer/parser approach (or your preferred monadic parser library) instead of doing it by hand. Returning location information here is not terribly difficult, though, just duplicate the i parameter into the index of the currently parsed character, on one hand, and the first index used for the current AST node, on the other; whenever you produce a result, the location is the pair (first index, last valid index).
Myello! So I am looking for a concise, efficient and idiomatic way in F# to parse a file or a string. I have a strong preference for treating the input as a sequence of char (char seq). The idea is that every function is responsible for parsing a piece of the input and returning the converted text tupled with the unused input. It is called by a higher-level function that chains the unused input to the following functions and uses the results to build a compound type. Every parsing function should therefore have a signature similar to this one: char seq -> char seq * 'a. If, for example, the function's responsibility is simply to extract the first word, then one approach would be the following:
let parseFirstWord (text: char seq) =
    let rec forTailRecursion t acc =
        let c = Seq.head t
        if c = '\n' then
            (t, acc)
        else
            forTailRecursion (Seq.skip 1 t) (c::acc)
    let rest, reversedWord = forTailRecursion text []
    (rest, List.rev reversedWord)
Now, of course, the main problem with this approach is that it extracts the word in reverse order, so you have to reverse it. Its main advantages, however, are that it uses strictly functional features and proper tail recursion. One could avoid reversing the extracted value at the cost of losing tail recursion:
let rec parseFirstWord (text: char seq) =
    let c = Seq.head text
    if c = '\n' then
        (text, [])
    else
        let rest, tail = parseFirstWord (Seq.skip 1 text)
        (rest, (c::tail))
Or use a fast mutable data structure underneath instead of using purely functional features, such as:
open System.Collections.Generic

let parseFirstWord (text: char seq) =
    let rec forTailRecursion t (queue: Queue<char>) =
        let c = Seq.head t
        if c = '\n' then
            (t, queue)
        else
            queue.Enqueue c   // Enqueue mutates the queue in place and returns unit
            forTailRecursion (Seq.skip 1 t) queue
    forTailRecursion text (Queue<char>())
I have no idea how to use OO concepts in F# mind you so corrections to the above code are welcome.
Being new to this language, I would like to be guided in terms of the usual compromises that an F# developer makes. Among the suggested approaches and your own, which should I consider more idiomatic and why? Also, in that particular case, how would you encapsulate the return value: char seq * char seq, char seq * char list or even char seq * Queue<char>? Or would you even consider char seq * String following a proper conversion?
I would definitely have a look at FsLex, FsYacc and FParsec. However, if you just want to tokenize a seq<char>, you can use a sequence expression to generate tokens in the right order. Reusing your idea of a recursive inner function and combining it with a sequence expression, we can stay tail recursive as shown below and avoid non-idiomatic tools like mutable data structures.
I changed the separator char for easy debugging and the signature of the function. This version produces a seq<string> (your tokens) as result, which is probably easier to consume than a tuple with the current token and the rest of the text. If you just want the first token, you can just take the head. Note that the sequence is generated 'on demand', i.e. the input is parsed only as tokens are consumed through the sequence. Should you need the remainder of the input text next to each token, you can yield a pair in loop instead, but I'm guessing the downstream consumer most likely wouldn't (furthermore, if the input text is itself a lazy sequence, possibly linked to a stream, we don't want to expose it as it should be iterated through only in one place).
let parse (text : char seq) =
    let rec loop t acc =
        seq {
            if Seq.isEmpty t then yield acc
            else
                let c, rest = Seq.head t, Seq.skip 1 t
                if c = ' ' then
                    yield acc
                    yield! loop rest ""
                else yield! loop rest (acc + string c)
        }
    loop text ""
parse "The FOX is mine"
val it : seq<string> = seq ["The"; "FOX"; "is"; "mine"]
This is not the only 'idiomatic' way of doing this in F#. Every time we need to process a sequence, we can look at the functions made available in the Seq module. The most general of these is fold which iterates through a sequence once, accumulating a state at each element by running a given function. In the example below accumulate is such a function, that progressively builds the resulting sequence of tokens. Since Seq.fold doesn't run the accumulator function on an empty sequence, we need the last two lines to extract the last token from the function's internal accumulator.
This second implementation keeps one nice characteristic of the first: it does not grow the stack (the loop lives inside the fold implementation, if I'm not mistaken). Note, though, that Seq.fold walks the whole input before it returns, so unlike the first version the input is not processed on demand. It also happens to be shorter, albeit probably a bit less readable.
let parse2 (text : char seq) =
    let accumulate (res, acc) c =
        if c = ' ' then (Seq.append res (Seq.singleton acc), "")
        else (res, acc + string c)
    let (acc, last) = text |> Seq.fold accumulate (Seq.empty, "")
    Seq.append acc (Seq.singleton last)
parse2 "The FOX is mine"
val it : seq<string> = seq ["The"; "FOX"; "is"; "mine"]
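For comparison, here is a sketch that keeps the question's "parsed value plus unused input" shape. It assumes the input is first converted to a char list, so that pattern matching can replace the Seq.head / Seq.skip calls; the name firstWord and the use of a space as separator are choices made for this example only.

```fsharp
// Hypothetical helper: returns the first word and the input that was not consumed.
let firstWord (text: char list) : string * char list =
    let rec loop acc rest =
        match rest with
        | [] -> acc, []                       // end of input ends the word
        | ' ' :: tail -> acc, tail            // separator: stop and drop it
        | c :: tail -> loop (c :: acc) tail   // tail call, accumulates in reverse
    let reversed, rest = loop [] text
    System.String(Array.ofList (List.rev reversed)), rest

let word, rest = firstWord (List.ofSeq "The FOX is mine")
printfn "%s | %s" word (System.String(List.toArray rest))   // The | FOX is mine
```

A higher-level function can feed `rest` to the next parser, which is the chaining described in the question.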
One way of lexing/parsing in a way truly unique to F# is by using active patterns. The following simplified example shows the general idea. It can process a calculation string of arbitrary length without producing a stack overflow.
let rec (|CharOf|_|) set = function
    | c :: rest when Set.contains c set -> Some(c, rest)
    | ' ' :: CharOf set (c, rest) -> Some(c, rest)
    | _ -> None

let rec (|CharsOf|) set = function
    | CharOf set (c, CharsOf set (cs, rest)) -> c::cs, rest
    | rest -> [], rest

let (|StringOf|_|) set = function
    | CharsOf set (_::_ as cs, rest) -> Some(System.String(Array.ofList cs), rest)
    | _ -> None

type Token =
    | Int of int
    | Add | Sub | Mul | Div | Mod
    | Unknown

let lex: string -> _ =
    let digits = set ['0'..'9']
    let ops = Set.ofSeq "+-*/%"
    let rec lex chars =
        seq { match chars with
              | StringOf digits (s, rest) -> yield Int(int s); yield! lex rest
              | CharOf ops (c, rest) ->
                  let op =
                      match c with
                      | '+' -> Add | '-' -> Sub | '*' -> Mul | '/' -> Div | '%' -> Mod
                      | _ -> failwith "invalid operator char"
                  yield op; yield! lex rest
              | [] -> ()
              | _ -> yield Unknown }
    List.ofSeq >> lex
lex "1234 + 514 / 500"
// seq [Int 1234; Add; Int 514; Div; Int 500]
I'm trying to solve tasks from 99 Haskell problems in F#.
The task #7 looks pretty simple, and the solution can be found in lots of places. Except the fact that the first several solutions that I've tried and found by googling (e.g. https://github.com/paks/99-FSharp-Problems/blob/master/P01to10/Solutions.fs) are wrong.
My example is pretty simple.
I'm trying to build extremely deep nested structure and fold it
type NestedList<'a> =
    | Elem of 'a
    | NestedList of NestedList<'a> list

let flatten list =
    //
    (* StackOverflowException
    | Elem(a) as i -> [a]
    | NestedList(nest) -> nest |> Seq.map myFlatten |> List.concat
    *)
    // Both are failed with stackoverflowexception too https://github.com/paks/99-FSharp-Problems/blob/master/P01to10/Solutions.fs

let insideGen count =
    let rec insideGen' count agg =
        match count with
        | 0 -> agg
        | _ ->
            insideGen' (count-1) (NestedList([Elem(count); agg]))
    insideGen' count (Elem(-1))

let z = insideGen 50000
let res = flatten z
I've tried to rewrite the solution in CPS style, but either I'm doing something wrong or I'm looking in the wrong direction: nothing that I've tried works.
Any advice?
p.s. Haskell solution, at least on nested structure with 50000 nested levels is working slowly, but without stack overflow.
Here's a CPS version that doesn't overflow using your test.
let flatten lst =
    let rec loop k = function
        | [] -> k []
        | (Elem x)::tl -> loop (fun ys -> k (x::ys)) tl
        | (NestedList xs)::tl -> loop (fun ys -> loop (fun zs -> k (zs @ ys)) xs) tl
    loop id [lst]
EDIT
A much more readable way to write this would be:
let flatten lst =
    let results = ResizeArray()
    let rec loop = function
        | [] -> ()
        | h::tl ->
            match h with
            | Elem x -> results.Add(x)
            | NestedList xs -> loop xs
            loop tl
    loop [lst]
    List.ofSeq results
Disclaimer - I'm not a deep F# programmer and this will not be idiomatic.
If your stack is overflowing, it means that you don't have a tail recursive solution. It also means that you are choosing to use stack memory for state. Traditionally, you want to exchange heap memory for stack memory since heap memory is in comparatively large supply. So the trick is to model a stack.
I'm going to define a virtual machine that is a stack. Each stack element will be a state nugget for traversing a list which will include the list and a program counter, which is the current element to examine and will be a tuple of a NestedList<'a> list * int. The list is the current list being traversed. The int is the current position in the list.
type NestedList<'a> =
    | Elem of 'a
    | Nested of NestedList<'a> list

let flatten l =
    let rec listMachine instructions result =
        match instructions with
        | [] -> result
        | (currList, currPC) :: tail ->
            if currPC >= List.length currList then listMachine tail result
            else
                match List.item currPC currList with
                | Elem(a) -> listMachine ((currList, currPC + 1 ) :: tail) (result @ [ a ])
                | Nested(l) -> listMachine ((l, 0) :: (currList, currPC + 1) :: instructions.Tail) result
    match l with
    | Elem(a) -> [ a ]
    | Nested(ll) -> listMachine [ (ll, 0) ] []
What have I done? I've written a tail-recursive function in the style of "Little Lisper" code: if my instruction list is empty, return my accumulated result. If not, operate on the top of the stack. I bind a convenience variable to the top, and if the PC is at the end, I recurse on the tail of the stack (pop) with the current result. Otherwise, I look at the current element in the list. If it's an Elem, I recurse, advancing the PC and appending the Elem onto the result. If it's not an Elem, I recurse by pushing a new stack entry for the nested list, followed by the current stack element with the PC advanced by 1, and everything else.
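The same "model the stack on the heap" idea can also be sketched without a program counter. This is a variation, not the code above: it assumes the pending work can be kept as a plain list of nodes still to visit, and it collects results in reverse to avoid appending at the end of a list on every element.

```fsharp
type NestedList<'a> =
    | Elem of 'a
    | Nested of NestedList<'a> list

let flatten (root: NestedList<'a>) : 'a list =
    // `pending` is the explicit, heap-allocated stack of nodes still to visit;
    // `acc` holds the elements found so far, in reverse order.
    let rec loop pending acc =
        match pending with
        | [] -> List.rev acc
        | Elem x :: rest -> loop rest (x :: acc)
        | Nested children :: rest -> loop (children @ rest) acc   // push the children
    loop [root] []

printfn "%A" (flatten (Nested [Elem 1; Nested [Elem 2; Nested [Elem 3]]; Elem 4]))
// [1; 2; 3; 4]
```

Every recursive call is in tail position, so the nesting depth of the input only affects the size of `pending`, not the call stack.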
Suppose I have the following code:
type Vehicle =
| Car of string * int
| Bike of string
let xs = [ Car("family", 8); Bike("racing"); Car("sports", 2); Bike("chopper") ]
I can filter above list using incomplete pattern matching in an imperative for loop like:
> for Car(kind, _) in xs do
> printfn "found %s" kind;;
found family
found sports
val it : unit = ()
but it causes the following warning: warning FS0025: Incomplete pattern matches on this expression. For example, the value 'Bike (_)' may indicate a case not covered by the pattern(s). Unmatched elements will be ignored.
As the ignoring of unmatched elements is my intention, is there a possibility to get rid of this warning?
And is there a way to make this work with list-comprehensions without causing a MatchFailureException? e.g. something like that:
> [for Car(_, seats) in xs -> seats] |> List.sum;;
val it : int = 10
Two years ago, your code was valid and it was the standard way to do it. Since then, the language has been cleaned up, and the design decision was to favour the explicit syntax. For this reason, I think it's not a good idea to ignore the warning.
The standard replacement for your code is:
for x in xs do
    match x with
    | Car(kind, _) -> printfn "found %s" kind
    | _ -> ()
(you could also use higher-order functions, as in pad's sample)
For the other one, List.sumBy would fit well:
xs |> List.sumBy (function Car(_, seats) -> seats | _ -> 0)
If you prefer to stick with comprehensions, this is the explicit syntax:
[for x in xs do
    match x with
    | Car(_, seats) -> yield seats
    | _ -> ()
] |> List.sum
You can silence any warning via the #nowarn directive or --nowarn: compiler option (pass the warning number, here 25 as in FS0025).
But more generally, no, the best thing is to explicitly filter, as in the other answer (e.g. with choose).
To state explicitly that you want to ignore unmatched cases, you can use List.choose and return None for the unmatched elements. Your code could be written in a more idiomatic way as follows:
let _ = xs |> List.choose (function | Car(kind, _) -> Some kind
                                    | _ -> None)
           |> List.iter (printfn "found %s")

let sum = xs |> List.choose (function | Car(_, seats)-> Some seats
                                      | _ -> None)
             |> List.sum