So I have this code:
let readLines filePath = System.IO.File.ReadLines(filePath);
let lines = Seq.toList (readLines "MovieSmall.txt");
let lines2 = List.map (fun (s:string) -> Array.toList (s.Split([|'\t'|]))) lines;
let imdb = List.map (fun (l1::l2::l3::l4::l5::l6::[]) -> [l1]::[l2]::[l3]:: (Array.toList ((l4:string).Split[|','|])) ::[l5]:: (Array.toList ((l6:string).Split[|','|])) ::[]) lines2;;
That reads data from a file called MovieSmall.txt and parses it into nested lists of strings.
MovieSmall.txt looks like this
The Shawshank Redemption 1994 9.30 Crime, Drama Frank Darabont Morgan Freeman
The Godfather 1972 9.20 Crime, Drama Francis Ford Coppola Al Pacino, James Caan, Robert Duvall, Diane Keaton, Talia Shire
Universal's Cinematic Spectacular: 100 Years of Movie Memories 2012 9.20 Documentary, Short Mike Aiello Morgan Freeman
I get
let imdb = List.map (fun (l1::l2::l3::l4::l5::l6::[]) -> [l1]::[l2]::[l3]:: (Array.toList ((l4:string).Split[|','|])) ::[l5]:: (Array.toList ((l6:string).Split[|','|])) ::[]) lines2;;
--------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^
stdin(4,27): warning FS0025: Incomplete pattern matches on this expression. For example, the value '[_;_;_;_;_;_;_]' may indicate a case not covered by the pattern(s).
Which I'm sure is ok because all the data fed to it will have the format above, but when I run
let rec get_rating movie = function
    | [] -> "Not found";
    | [[title]; b; c; d; e; f]::t -> if title=movie then (string)c else get_rating movie t;
get_rating "Batman Begins" imdb;;
I get
let rec get_rating movie = function
---------------------------^^^^^^^^
stdin(24,28): warning FS0025: Incomplete pattern matches on this expression. For example, the value '[[_;_;_;_;_;_;_]]' may indicate a case not covered by the pattern(s).
val get_rating :
movie:'a -> _arg1:'a list list list -> string when 'a : equality
val it : string = "[8.30]"
While adding the case of
let rec get_rating movie = function
    | [] -> "Not found";
    | [[title]; b; c; d; e; f]::t -> if title=movie then (string)c else get_rating movie t;
    |_-> "Does not match";
get_rating "Batman Begins" imdb;
stops the warning from appearing, I don't know whether this is the correct way to approach it.
Thoughts?
When you only want to match on a small subset of all the possible patterns, as you're doing here, then finishing the match statement with _ -> default logic is exactly what you want to do.
The F# compiler emits that warning because an incomplete pattern match opens your code up to run-time failures when unexpected data comes through. Adding the | _ -> pattern at the end lets you dictate what action to take when something unexpected happens.
Long story short, you're taking the right approach by having the _-> "Does not match" in your code.
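To make the difference concrete, here is a minimal sketch. The functions ratingStrict and ratingSafe and the sample rows are made up for this illustration and are not from the question. It assumes that an incomplete match raises MatchFailureException when no case applies, while the wildcard version returns the default:

```fsharp
// Hypothetical helpers for illustration only.

// Incomplete: only handles rows with exactly three fields.
// The compiler reports warning FS0025 for this match.
let ratingStrict (row: string list) =
    match row with
    | [ _title; _year; rating ] -> rating

// Complete: the wildcard case covers every other shape.
let ratingSafe (row: string list) =
    match row with
    | [ _title; _year; rating ] -> rating
    | _ -> "Does not match"

let good = [ "The Godfather"; "1972"; "9.20" ]
let bad = [ "The Godfather"; "1972" ]

printfn "%s" (ratingSafe good)
printfn "%s" (ratingSafe bad)

// The strict version throws on the malformed row instead of returning a value.
let strictResult =
    try ratingStrict bad
    with :? MatchFailureException -> "MatchFailureException"
printfn "%s" strictResult
```

With the wildcard in place, you decide what a malformed row means; without it, the first malformed row ends the program unless the exception is caught.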
You can disable compiler warnings:
#nowarn "25"
Can somebody help me with this article by Tomas Petricek: http://tomasp.net/blog/fsharp-dynamic-lookup.aspx/#dynfslinks?
The problem is that it is severely outdated. I understand that namespaces
open Microsoft.FSharp.Quotations.Typed
open Microsoft.FSharp.Quotations.Raw
are gone, so I removed those opens. But there are still errors: "Typed" is not defined, "RecdGet" is not defined, and I suspect they are not the last. I'm trying to prove to my boss that F# is a good fit for database normalization. Dynamic lookup of fields would really help me deal with similarly named fields that have different prefixes.
There is also a post by Tomas on fpish: https://fpish.net/topic/None/57493, which I understand predates the article.
Here's a rough equivalent:
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
type DynamicMember<'t,'u> = Expr<'t -> 'u>
let getValueReader (expr:DynamicMember<'recdT, 'fieldT>) =
    // Match the quotation representing the symbol
    match expr with
    | Lambda(v, PropertyGet (Some (Var v'), pi, [])) when v = v' ->
        // It represents reading of the F# record field..
        // .. get a function that reads the record field using F# reflection
        let rdr = Reflection.FSharpValue.PreComputeRecordFieldReader pi
        // we're not adding any additional processing, so we just
        // simply add type conversion to the correct types & return it
        ((box >> rdr >> unbox) : 'recdT -> 'fieldT)
    | _ ->
        // Quotation doesn't represent symbol - this is an error
        failwith "Invalid expression - not reading record field!"
type SampleRec = { Str : string; Num : int }
let readStrField = getValueReader <@ fun (r : SampleRec) -> r.Str @>
let readNumField = getValueReader <@ fun (r : SampleRec) -> r.Num @>
let rc = { Str = "Hello world!"; Num = 42 }
let s, n = readStrField rc, readNumField rc
printfn "Extracted: %s, %d" s n
If I have a type named Person, and list of functions, for example...
let checks = [checkAge; checkWeight; checkHeight]
...where each function is of the type (Person -> bool), and I want to do the equivalent of...
checkAge >> checkWeight >> checkHeight
...but I don't know in advance what functions are in the list, how would I do it?
I tried the following...
checks |> List.reduce (>>)
...but this gives the following error...
error FS0001: Type mismatch. Expecting a
(Person -> bool) -> (Person -> bool) -> Person -> bool
but given a
(Person -> bool) -> (bool -> 'a) -> Person -> 'a
The type 'Person' does not match the type 'bool'
What am I doing wrong?
It looks like Railway oriented programming would be a good fit here.
If you choose to go this route, you basically have two options.
You can either go all in, or the quick route.
Quick route
You rewrite your validation functions to take a Person option instead of just plain Person.
let validAge (person:Person option) =
    match person with
    | Some p when p.Age < 65 && p.Age > 18 -> person
    | _ -> None
Now you should be able to easily chain your function.
checks |> List.reduce (>>)
All in
Alternatively, if you are lazy and don't want to match .. with in every validation function, you can write some more code.
(samples taken from [1])
First there's a bit of setup to do.
We'll define a special return type, so we can get meaningful error messages.
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure
A bind function, to bind the validations together
let bind switchFunction =
    function
    | Success s -> switchFunction s
    | Failure f -> Failure f
You'll have to rewrite your validation functions as well.
let validAge (person:Person) =
    if person.Age < 65 && person.Age > 18 then Success person
    else Failure "Not the right age bracket"
Now combine with
checks |> List.reduce (fun acc elem -> acc >> bind elem)
Either way, check out the original article.
There's much more there you might be able to use :)
Edit: I just noticed that I was too slow in writing this answer once again.
Besides, I think Helge explained the concept better than I did as well.
You may somehow have stumbled upon a dreaded concept. Apparently it's the Voldemort (don't say his name!) of functional programming.
Without further ado, let's walk right into the code:
type Person =
    { Name : string
      Age : int
      Weight : int
      Height : int }
type Result =
    | Ok of Person
    | Fail
let bind f m =
    match m with
    | Ok p -> f p
    | _ -> Fail
let (>=>) f1 f2 = f1 >> (bind f2)
let checkAge p =
    if p.Age > 18 then Ok(p)
    else Fail
let checkWeight p =
    if p.Weight < 80 then Ok(p)
    else Fail
let checkHeight p =
    if p.Height > 150 then Ok(p)
    else Fail
let checks = [ checkAge; checkWeight; checkHeight ]
let allcheckfunc = checks |> List.reduce (>=>)
let combinedChecks =
    checkAge
    >=> checkWeight
    >=> checkHeight
let p1 =
    { Name = "p1"
      Age = 10
      Weight = 20
      Height = 110 }
let p2 =
    { Name = "p2"
      Age = 19
      Weight = 65
      Height = 180 }
allcheckfunc p1
combinedChecks p1
allcheckfunc p2
combinedChecks p2
At this point I could throw around a lot of weirdo lingo (not really true, I couldn't...), but let's just look at what I have done.
I dropped your return value of bool and went for another type (Result) with either (mark that keyword!) Ok or Fail.
Then I made a helper (bind) to wrap and unwrap values of that Result type.
And a new operator (>=>) to combine the functions in reduce.
Mind that the first check function to return Fail will short-circuit the entire chain and return Fail more or less immediately (without calling the other functions). In addition, you will not know where in the chain it failed, or which functions before the Fail actually returned Ok.
There are also ways to accumulate the errors as you go along, so that you get feedback like: "checkAge returned Fail, but the others succeeded".
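As a rough sketch of that accumulating variant (this is not the approach from the linked article, and the error strings and the collectErrors helper are made up for this example), each check can return an optional error message, and all messages are collected instead of stopping at the first failure:

```fsharp
// Hypothetical sketch: checks return Some message on failure, None on success.
type Person = { Name: string; Age: int; Weight: int; Height: int }

let checkAge p = if p.Age > 18 then None else Some "checkAge failed"
let checkWeight p = if p.Weight < 80 then None else Some "checkWeight failed"
let checkHeight p = if p.Height > 150 then None else Some "checkHeight failed"

let checks = [ checkAge; checkWeight; checkHeight ]

// Run every check and keep the messages of those that failed.
// An empty result means the person passed all checks.
let collectErrors checks person =
    checks |> List.choose (fun check -> check person)

let p1 = { Name = "p1"; Age = 10; Weight = 20; Height = 110 }
let p2 = { Name = "p2"; Age = 19; Weight = 65; Height = 180 }

printfn "%A" (collectErrors checks p1) // ["checkAge failed"; "checkHeight failed"]
printfn "%A" (collectErrors checks p2) // []
```

The trade-off is that every check always runs, so this only makes sense when the checks are independent of one another.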
The code is mostly stolen from here: http://fsharpforfunandprofit.com/posts/recipe-part2/
You may want to read the rest of Wlaschin's website, and a lot more besides, if you want to get into the finer and harder details.
Good luck on your journey to the upper floors of the Ivory Tower. ;-)
Footnote: This is usually called an Either monad. It's not entirely finished in the above code, but never mind... I think it will work in your case...
The >> operator is useful if you have functions that perform some transformation. For example, if you had a list of functions Person -> Person that turn one person into another.
In your case, it seems that you have functions Person -> bool and you want to build a composed function that returns true if all functions return true.
Using List.reduce you can write:
checks|> List.reduce (fun f g -> (fun p -> f p && g p))
Perhaps an easier option is to just write a function that takes a person and uses List.forall:
let checkAll checks person = checks |> List.forall (fun f -> f person)
Let's say you have this union:
type Thing =
    | Eagle
    | Elephant of int
And your code has a list of Elephants, as in
let l = [Elephant (1000); Elephant (1200)]
And you wanted to iterate over l, and print out the data associated with each Elephant. Is there a way to do so without using a pattern match?
In your example, you say that you have a list of elephants - which is true in this case - but the type of l is really a list of Thing values and so it can contain both elephants and eagles. This is why you need to use pattern matching - to handle all possible cases.
If you regularly need to use lists that contain only elephants, then it might make sense to define a separate type for elephants. Something like:
type ElephantInfo = { Size : int }
type Thing =
    | Elephant of ElephantInfo
    | Eagle
Now you can create a list of type list<ElephantInfo> which can contain just elephants and so you don't need pattern matching:
let l1 = [ {Size=1}; {Size=2} ]
for el in l1 do printfn "%d" el.Size
On the other hand, if you want to mix elephants and eagles, you'll create list<Thing> and then use pattern matching:
let l2 = [ Elephant {Size=1}; Eagle ]
You could do this:
l
|> List.collect (function Elephant x -> [x] | _ -> [])
|> List.iter (printfn "%i")
Prints
1000
1200
It still uses pattern matching, but it's fairly minimal.
You have of course the option of going full Ivory Tower (® Scott Wlaschin)
Something like:
type Thing =
    | Eagle
    | Elephant of int
type MaybeElephantBuilder() =
    member this.Bind(x, f) =
        match x with
        | Eagle -> 0
        | Elephant a -> f a
    member this.Return(x) = x
let maybeElephant = new MaybeElephantBuilder()
let l =
    [ Elephant(1000)
      Elephant(1200)
    ]
let printIt v =
    let i =
        maybeElephant {
            let! elephantValue = v
            return elephantValue
        }
    printfn "%d" i
l |> Seq.iter printIt
It will even handle the stuff with the Eagles thrown in there!
Well...
Remove the non-Eagles and the code will fly...
let l =
[ Eagle
Leadon
Elephant(1000)
Eagle
Meisner
Elephant(1200)
Eagle
Felder
]
l |> Seq.iter printIt
But no. It's not nice. It's not short. It's more for fun (if that!) than anything else. It's probably the worst misuse of F# computation expressions ever, too!
And you will need pattern matching somewhere.
Thx Scott! And Petricek.
Computation Expression Zoo for real! ;-)
You can use reflection from the Microsoft.FSharp.Reflection namespace, but it is much more cumbersome and slow.
Pattern matching is probably the easiest way to get data out of a discriminated union.
(Also note that you have a list of Things; all of its members just happen to be of the Elephant union case.)
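For completeness, a minimal sketch of the reflection route, using FSharpValue.GetUnionFields from Microsoft.FSharp.Reflection. The describe helper is a made-up name for this example; it shows why this is more cumbersome than a pattern match, since the field values come back boxed as obj:

```fsharp
open Microsoft.FSharp.Reflection

type Thing =
    | Eagle
    | Elephant of int

let l = [ Elephant 1000; Elephant 1200 ]

// Hypothetical helper: returns the case name and the field values as strings.
// GetUnionFields yields the case description plus an array of boxed fields.
let describe (thing: Thing) =
    let case, fields = FSharpValue.GetUnionFields(thing, typeof<Thing>)
    case.Name, fields |> Array.toList |> List.map (fun (f: obj) -> string f)

for thing in l do
    let name, values = describe thing
    printfn "%s %A" name values
```

Note that nothing here stops an Eagle from appearing in the list; the reflection call simply returns an empty field array for it, so the caller still has to decide what to do in that case.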
There's a way to place the pattern match into the header of the function (or a let binding). It is still a pattern match, though.
// This function takes a tuple:
// the first argument is a Thing,
// the second is "default" weight to be processed if the first one is NOT an Elephant
let processElephant (Elephant weight, _ | _, weight) =
    weight
let [<Literal>] NON_ELEPHANT_WEIGHT = -1
// usage:
let totalWeight =
    [Elephant (1000); Elephant (1200)]
    |> List.sumBy (fun el -> processElephant(el, NON_ELEPHANT_WEIGHT))
This question and its answers provide more details.
I processed some HTML to extract various information from a website (no proper API exists there), and generated a list of tokens using an F# discriminated union. I have simplified my code to the essence:
type tokens =
    | A of string
    | B of int
    | C of string
let input = [A "1"; B 2; C "2.1"; C "2.2"; B 3; C "3.1"]
// how to transform the input to the following ???
let desiredOutput = [A "1", [[ B 2, [ C "2.1"; C "2.2" ]]; [B 3, [ C "3.1" ]]]]
This roughly corresponds to parsing the grammar: g -> A b* ; b -> B c* ; c-> C
The key thing is my token list is flat, but I want to work with the hierarchy implied by the grammar.
Perhaps there is another representation of my desiredOutput which would be better; what I really want to do is process exactly one A followed by a zero or more sequence of Bs, which happen to contain zero or more Cs.
I've looked at parser combinators articles, e.g. about FParsec, but I couldn't find a good solution that allows me to start from a list of tokens rather than a stream of characters. I'm familiar with imperative techniques for parsing, but I don't know what is idiomatic F#.
Progress made due to Answer
Thanks to the answer from Vandroiy, I was able to write the following to move forward a hobby project I am working on to learn idiomatic F# (and also to scrape quiz websites).
// transform flat data scraped from a Quiz website into a hierarchical data structure
type ScrapedQuiz =
    | Title of string
    | Description of string
    | Blurb of string * picture: string
    | QuizId of string
    | Question of num:string * text:string * picture : string
    | Answer of text:string
    | Error of exn
let input =
    [Title "Example Quiz Scraped from a website";
     Description "What the Quiz is about";
     Blurb ("more details","and a URL for a picture");
     Question ("#1", "How good is F#", "URL to picture of new F# logo");
     Answer ("we likes it");
     Answer ("we very likes it");
     Question ("#2", "How useful is Stack Overflow", "URL to picture of Stack Overflow logo");
     Answer ("very good today");
     Answer ("lobsters");
    ]
type Quiz =
    { Title : string
      Description : string
      Blurb : string * PictureURL
      Questions : Quest list }
and Quest =
    { Number : string
      Text : string
      Pic : PictureURL
      Answers : string list}
and PictureURL = string
let errorMessage = "unexpected input format"
let parseList reader input =
    let rec run acc inp =
        match reader inp with
        | Some(o, inp') -> run (o :: acc) inp'
        | None -> List.rev acc, inp
    run [] input
let readAnswer = function Answer(a) :: t -> Some(a, t) | _ -> None
let readDescription =
    function Description(a) :: t -> (a, t) | _ -> failwith errorMessage
let readBlurb = function Blurb(a,b) :: t -> ((a,b),t) | _ -> failwith errorMessage
let readQuests = function
    | Question(n,txt,pic) :: t ->
        let answers, input' = parseList readAnswer t
        Some( { Number=n; Text=txt; Pic=pic; Answers = answers}, input')
    | _ -> None
let readQuiz = function
    | Title(s) :: t ->
        let d, input' = readDescription t
        let b, input'' = readBlurb input'
        let qs, input''' = parseList readQuests input''
        Some( { Title = s; Description = d; Blurb = b; Questions = qs}, input''')
    | _ -> None
match readQuiz input with
| Some(a, []) -> a
| _ -> failwith errorMessage
I could not have written this yesterday; neither the target data type, nor the parsing code. I see room for improvement, but I think I have started to meet my goal of not writing C# in F#.
Indeed, it might help to first find a good representation.
Original output format
I presume the suggested output form, in standard printing, would be:
[(A "1", [(B 2, [C "2.1"; C "2.2"]); (B 3, [C "3.1"])])]
(This differs from the one in the question in the number of list levels.) The code I used to get there is ugly. In part, this is because it abstracts at an awkward position, constraining input and output types tightly without giving them a well-defined type. I'm posting it for the sake of completeness, but I recommend skipping over it.
let rec readBranch checkOne readInner acc = function
    | h :: t when checkOne h ->
        let dat, inp' = readInner t
        readBranch checkOne readInner ((h, dat) :: acc) inp'
    | l -> List.rev acc, l
let rec readCs acc = function
    | C(s) :: t -> readCs (C(s) :: acc) t
    | l -> List.rev acc, l
let readBs = readBranch (function B _ -> true | _ -> false) (readCs []) []
let readAs = readBranch (function A _ -> true | _ -> false) readBs []
input |> readAs |> fst
Surely, other people can do this more sensibly, but I doubt it would tackle the main problem: we're just projecting one weird data structure to the next. If it is difficult to read or formulate a parser's output format, there is probably something going wrong.
Strongly typed output
Rather than focus on how we are parsing, I prefer to first pay attention to what we are parsing into. These A B C things don't mean anything to me. Let's say they represent objects:
type Bravo =
    { ID : int
      Charlies : string list }
type Alpha =
    { Name : string
      Bravos : Bravo list }
There are two places where sequences of objects of the same type are parsed. Let's create a helper that repeatedly uses a specific parser to read a list of objects:
/// Parses objects into a list. reader takes an input and returns either
/// Some(parsed item, new input state), or None if the list is finished.
/// Returns a list of parsed objects and the remaining input.
let parseList reader input =
    let rec run acc inp =
        match reader inp with
        | Some(o, inp') -> run (o :: acc) inp'
        | None -> List.rev acc, inp
    run [] input
Note that this is quite generic in the type of input. This helper could be used with strings, sequences, or whatever.
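To illustrate that genericity, here is a small sketch that reuses the same helper on a list of characters instead of tokens. The readDigit reader is a made-up example and not part of the original problem:

```fsharp
// Same helper as above, repeated so the example is self-contained.
let parseList reader input =
    let rec run acc inp =
        match reader inp with
        | Some(o, inp') -> run (o :: acc) inp'
        | None -> List.rev acc, inp
    run [] input

// Hypothetical reader: consumes one decimal digit from a char list.
let readDigit = function
    | (c: char) :: rest when System.Char.IsDigit c -> Some(int c - int '0', rest)
    | _ -> None

// Reads digits until the first non-digit character.
let digits, remaining = parseList readDigit (List.ofSeq "123abc")
printfn "%A %A" digits remaining
// digits = [1; 2; 3], remaining = ['a'; 'b'; 'c']
```

The helper never inspects the input itself; only the reader does, which is why the input can be any type the reader understands.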
Now, we add concrete parsers. The following functions have the signature used in reader in the helper; they either return the parsed object and the remaining input, or None if parsing wasn't possible.
let readC = function C(s) :: t -> Some(s, t) | _ -> None
let readB = function
    | B(i) :: t ->
        let charlies, input' = parseList readC t
        Some( { ID = i; Charlies = charlies }, input' )
    | _ -> None
let readA = function
    | A(s) :: t ->
        let bravos, input' = parseList readB t
        Some( { Name = s; Bravos = bravos }, input' )
    | _ -> None
The code for reading Alphas and Bravos is practically a duplicate. If that happens in production code, I would again recommend checking whether the data structure is optimal first, and only looking at improving the algorithm afterwards.
We request to read one A into one Alpha, which was the goal after all:
match readA input with
| Some(a, []) -> a
| _ -> failwith "Unexpected input format"
There may be many better ways to do the parsing, especially when knowing more about the exact problem. The important fact is not how the parser works, but what the output looks like, which will be the focus when actual work is done in the program. The second version's output should be much easier to navigate in both code and debugger:
val it : Alpha =
  { Name = "1";
    Bravos = [ { ID = 2; Charlies = ["2.1"; "2.2"] }
               { ID = 3; Charlies = ["3.1"] } ] }
One could take this a step further and replace the tokenized data structure with DOM (Document Object Model). Then, the first step would be to read HTML into DOM using a standard parsing library. In a second step, the concrete parsers would construct objects, using the DOM representation as input, calling one another top-down.
To work with a structured hierarchy, you have to create a matching structure of types. Something like:
type RootType = Level1 list
and Level1 =
    | A of string
    | B of Level2 list
    | C of string
and Level2 =
    { b: int; c: string list }
I am trying to write code to remove stopwords like "the" and "this" from a string list.
I wrote this code:
let rec public stopword (a : string list, b :string list) =
    match [a.Head] with
    |["the"]|["this"] -> stopword (a.Tail, b)
    |[] -> b
    |_ -> stopword (a.Tail, b@[a.Head])
I ran this in the interactive:
stopword (["this";"is";"the"], []);;
I got this error:
This expression was expected to have type string list but here has type 'a * 'b
Match expressions in F# are very powerful, although the syntax is confusing at first.
You need to match on the list itself, like so:
let rec stopword a =
    match a with
    |"the"::t |"this"::t -> stopword t
    |h::t ->h::(stopword t)
    |[] -> []
The actual error is due to the function expecting a tuple argument. You would have to call the function with:
let result = stopword (["this";"is";"the"], [])
Edit: since the original question was changed, the above answer is not valid anymore. The logical error in the actual function is that you end up with a single-element list of which the tail is taken, resulting in an empty list. On the next recursive call, the function chokes when trying to get the head of this empty list.
The function itself is not correctly implemented, though, and is much more complicated than necessary.
let isNoStopword (word:string) =
    match word with
    | "the"|"this" -> false
    | _ -> true
let removeStopword (a : string list) =
    a |> List.filter(isNoStopword)
let test = removeStopword ["this";"is";"the"]
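If you prefer to keep the accumulator style of the original question, a possible sketch looks like this. It matches on the list itself, so the empty case is reachable, and it assumes the function is called with an empty accumulator, as in the question:

```fsharp
// Sketch of the question's tuple-based function with the match fixed.
// Words are prepended to the accumulator and reversed once at the end,
// which avoids repeated list appends.
let rec stopword (a: string list, b: string list) =
    match a with
    | [] -> List.rev b
    | ("the" | "this") :: t -> stopword (t, b)
    | h :: t -> stopword (t, h :: b)

let result = stopword ([ "this"; "is"; "the" ], [])
printfn "%A" result // ["is"]
```

Because the recursive call is the last thing each branch does, this version is also tail-recursive.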
Others have mentioned the power of pattern matching in this case. In practice, you usually have a set of stopwords you want to remove. And the when guard allows us to pattern match quite naturally:
let rec removeStopwords (stopwords: Set<string>) = function
    | x::xs when Set.contains x stopwords -> removeStopwords stopwords xs
    | x::xs -> x::(removeStopwords stopwords xs)
    | [] -> []
The problem with this function and @John's answer is that they are not tail-recursive. They run out of stack on a long list that contains only a few stopwords. It's a good idea to use the higher-order functions in the List module, which are tail-recursive:
let removeStopwords (stopwords: Set<string>) xs =
    xs |> List.filter (stopwords.Contains >> not)