F# list pattern matching limitations or just bad writing?

Given some CreditScoreInput:
type CreditScoreInput = { id: string; score: string; years: int }
let input = [
    { id = "CUSTOMER001"; score = "Medium"; years = 1 }
    { id = "CUSTOMER001"; score = "Medium"; years = 1 }
    { id = "CUSTOMER002"; score = "Medium"; years = 10 }
    { id = "CUSTOMER003"; score = "Bad"; years = 0 }
    { id = "CUSTOMER003"; score = "Bad"; years = 0 }
    { id = "CUSTOMER003"; score = "Bad"; years = 0 }
    { id = "CUSTOMER004"; score = "Good"; years = 0 }
    { id = "CUSTOMER005"; score = "Good"; years = 10 }
]
My function validateDuplicates looks for duplicates:
let validate list =
    match list with
    | [] -> failwith "No customers supplied!"
    | _ -> list
let validateDuplicates (group:string * list<CreditScoreInput>) =
    match group with
    | (id, g) when g.Length = 1 -> printf $"No duplicates for {id} is OK.\n"
    | (id, [input1; input2]) -> printf $"Two duplicates for {id} is OK.\n"
    | (id, g) when g.Length > 2 -> printf $"More than two duplicates for {id} is NOT OK.\n"
    true
input
|> validate
|> List.groupBy (fun i -> i.id)
|> List.forall (fun i -> i |> validateDuplicates)
|> ignore
Inside of validateDuplicates I notice a little squiggly under group, leading to the warning:
Incomplete pattern matches on this expression. For example, the value (_,[_;_;_]) may indicate a case not covered by the pattern(s). However, a pattern rule with a when clause might successfully match this value.
Is there a way I can play nice with the compiler to avoid this warning?
Update
I am not sure whether I should do this here but here are my changes based on the excellent guidance:
let validateDuplicates (group:string * list<CreditScoreInput>) =
    match group with
    | (id, [_]) -> printf $"No duplicates for {id} is OK.\n"
    | (id, [_; _]) -> printf $"Two duplicates for {id} is OK.\n"
    | (id, g) -> printf $"More than two duplicates for {id} is NOT OK.\n"
input
|> validate
|> List.groupBy (fun i -> i.id)
|> List.iter (fun i -> i |> validateDuplicates)

Just get rid of the last when clause, because you know it will always be true:
match group with
| (id, g) when g.Length = 1 -> printf $"No duplicates for {id} is OK.\n"
| (id, [input1; input2]) -> printf $"Two duplicates for {id} is OK.\n"
| (id, g) -> printf $"More than two duplicates for {id} is NOT OK.\n"
Proof:
g.Length can never be negative, and List.groupBy never produces an empty group, so it can never be 0 either
If g.Length is 1 then it will match the first case
If g.Length is 2 then it will match the second case
Therefore, g.Length will always be > 2 if control reaches the third case.
Here's how I would suggest you write this code instead:
let validateDuplicates (id, g : List<_>) =
    match g.Length with
    | 0 -> failwith "Unexpected"
    | 1 -> printf $"No duplicates for {id} is OK.\n"
    | 2 -> printf $"Two duplicates for {id} is OK.\n"
    | _ -> printf $"More than two duplicates for {id} is NOT OK.\n"
input
|> validate
|> List.groupBy (fun i -> i.id)
|> List.iter validateDuplicates
The changes I've made are:
Use List.iter instead of piping List.forall into ignore.
Eliminate unneeded lambda in invocation of validateDuplicates.
Use pattern matching to deconstruct the input to validateDuplicates
Match directly on g.Length instead of using when clauses.
Defensive programming: explicitly check for an empty list to make your intention clear.
You might also want to consider making your validation functions pure (i.e. no side-effects) via F#'s Result type.
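A minimal sketch of what a side-effect-free variant using Result might look like. The error messages and the helper name validateAll are illustrative assumptions, not part of the original answer:

```fsharp
type CreditScoreInput = { id: string; score: string; years: int }

// Hypothetical pure variant: returns a Result instead of printing.
let validateDuplicates (id: string, g: CreditScoreInput list) =
    match g with
    | [] -> Error $"No records for {id}."
    | [ _ ] | [ _; _ ] -> Ok id
    | _ -> Error $"More than two duplicates for {id} is NOT OK."

// Hypothetical helper: checks every group and collects all error messages.
let validateAll (input: CreditScoreInput list) =
    match input with
    | [] -> Error [ "No customers supplied!" ]
    | _ ->
        let errors =
            input
            |> List.groupBy (fun i -> i.id)
            |> List.choose (fun group ->
                match validateDuplicates group with
                | Ok _ -> None
                | Error msg -> Some msg)
        if List.isEmpty errors then Ok input else Error errors
```

The caller then decides what to do with the errors (print them, log them, fail), which keeps the validation itself easy to test.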

Related

How to group data attached to discriminated union values, in F#?

Here is an example:
type Events =
    | A of AData
    | B of BData
    | C of CData
and I have a list of those:
let events : Events list = ...
I need to build a list by event type. Right now I do this:
let listA =
    events
    |> List.map (fun x ->
        match x with
        | A a -> Some a
        | _ -> None
    )
    |> List.choose id
and repeat for each type...
I also thought I could do something like:
let rec split events a b c =
    match events with
    | [] -> (a |> List.rev, b |> List.rev, c |> List.rev)
    | h :: t ->
        let a, b, c =
            match h with
            | A x -> x::a, b, c
            | B x -> a, x::b, c
            | C x -> a, b, x::c
        split t a b c
Is there a more elegant manner to solve this?
This processes a lot of data, so speed is important here.
You can fold back over the list of events to avoid writing a recursive function and reversing the results. With an anonymous record as the state, put the list and the initial state in a tuple and pipe both arguments to List.foldBack with ||>:
let eventsByType =
    (events, {| listA = []; listB = []; listC = [] |})
    ||> List.foldBack (fun event state ->
        match event with
        | A a -> {| state with listA = a :: state.listA |}
        | B b -> {| state with listB = b :: state.listB |}
        | C c -> {| state with listC = c :: state.listC |})
With a named record it is more elegant:
{ listA = []; listB = []; listC = [] } |> List.foldBack addEvent events
addEvent is the same as the lambda above, except that it uses a named record {} instead of an anonymous record {||}.
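For illustration, the named-record version could be spelled out roughly as follows. The payload types (int, string, bool) and the record name EventsByType are placeholders, since the question does not show AData, BData and CData:

```fsharp
// Payload types are placeholders chosen for illustration.
type Events =
    | A of int
    | B of string
    | C of bool

// Hypothetical named record holding one list per case.
type EventsByType =
    { listA: int list
      listB: string list
      listC: bool list }

// Same logic as the lambda, written against the named record.
let addEvent event state =
    match event with
    | A a -> { state with listA = a :: state.listA }
    | B b -> { state with listB = b :: state.listB }
    | C c -> { state with listC = c :: state.listC }

let events = [ A 1; B "one"; A 2; C true ]

// foldBack walks the list from the right, so each list keeps its original order.
let eventsByType =
    { listA = []; listB = []; listC = [] } |> List.foldBack addEvent events
```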
I think your solution is pretty good, although you do pay a price for reversing the lists. The only other semi-elegant approach I can think of is to unzip a list of tuples:
let split events =
    let a, b, c =
        events
        |> List.map (function
            | A n -> Some n, None, None
            | B s -> None, Some s, None
            | C b -> None, None, Some b)
        |> List.unzip3
    let choose list = List.choose id list
    choose a, choose b, choose c
This creates several intermediate lists, so careful internal use of Seq or Array instead might perform better. You would have to benchmark to be sure.
Test case:
split [
    A 1
    A 2
    B "one"
    B "two"
    C true
    C false
] |> printfn "%A" // ([1; 2], ["one"; "two"], [true; false])
By the way, your current solution can be simplified to:
let listA =
    events
    |> List.choose (function A a -> Some a | _ -> None)
If you keep the union cases, you can group the list items like this.
let name = function
    | A _ -> "A"
    | B _ -> "B"
    | C _ -> "C"
let lists =
    events
    |> List.groupBy name
    |> dict
And then you can extract the data you want.
let listA = lists["A"] |> List.map (fun (A data) -> data)
(The compiler doesn't realize the list only consists of "A" cases, so it gives an incomplete pattern match warning😀)

match by value in a discriminated union, in F#

with this union:
type T =
    | A
    | B
    | C
and a T list
I would like to implement something like this pseudo code:
let countOfType (t: Type) (l: T list) =
    l
    |> List.filter (fun x -> x.GetType() = t)
    |> List.length
where I would pass whichever case I want to count ('A', 'B', etc.),
but A.GetType() and B.GetType() return the T type, so this doesn't work.
Is there a way where I could check the type by passing it as a parameter?
The practical case here is that I have a Map that gets updated every few seconds and its values are part of the same DU. I need to be able to see how many of each type, without having to update the code (like a match block) each time an entry gets added.
Addendum:
I simplified the original question too much and realized it after seeing Fyodor's answer.
So I would like to add the additional part:
how could this also be done for cases like these:
type T =
    | A of int
    | B of string
    | C of SomeOtherType
For such enum type T as you specified, you can just use regular comparison:
let countOfType t (l: T list) =
    l
    |> List.filter (fun x -> x = t)
    |> List.length
Usage:
> countOfType A [A; A; B; C; A]
3
> countOfType B [A; A; B; C; A]
1
Try List.choose: ('a -> 'b option) -> 'a list -> 'b list. It filters a list based on an 'a -> 'b option selector: if the selector evaluates to Some, the value is included; if it evaluates to None, the value is skipped. If you are worried about the allocations caused by instantiating Some, you will have to implement a version that uses ValueOption.
let onlyA lis =
    lis |> List.choose (function
        | (A _) as a -> Some a
        | _ -> None)
let onlyB lis =
    lis |> List.choose (function
        | (B _) as b -> Some b
        | _ -> None)
let lis = [
    A 1
    A 22
    A 333
    B ""
    B "123"
]
lis |> onlyA |> List.length |> printfn "%d"
You can pattern match, and throw away the data, to create a function for the filter.
type T =
    | A of int
    | B of string
    | C of float
[A 3;A 1;B "foo";B "bar";C 3.1; C 4.6]
|> List.filter (fun x ->
    match x with
    | A _ -> true
    | B _ -> false
    | C _ -> false
)
|> List.length
But in general I would assume that you create a predicate function in your module.
let isA x =
    match x with
    | A _ -> true
    | _ -> false
If you have those functions, you can just write
[A 3;A 1;B "foo";B "bar";C 3.1; C 4.6]
|> List.filter isA
|> List.length
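If the goal is to count every case without touching a match block when a case is added, one possible approach (not taken from the answers above) is to read the case name through FSharp.Core's reflection helpers and count by it. This is only a sketch: caseName is a hypothetical helper, and reflection is slower than pattern matching, so it may not suit hot paths:

```fsharp
open Microsoft.FSharp.Reflection

type T =
    | A of int
    | B of string
    | C of float

// Hypothetical helper: returns the union case name of a value, e.g. "A" for A 3.
let caseName (x: T) =
    let case, _ = FSharpValue.GetUnionFields(x, typeof<T>)
    case.Name

// One pass over the list; a case added to T later is counted without code changes.
let counts =
    [ A 3; A 1; B "foo"; B "bar"; C 3.1; C 4.6; A 7 ]
    |> List.countBy caseName
```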

How do I separate case ids from case values on a Discriminated Union?

I want to build a dictionary from a list of items.
An item has the following definition:
type Item =
    | A of TotalPrice * Special
    | B of TotalPrice * Special
    | C of TotalPrice
    | D of TotalPrice
I want the keys of the dictionary to map to the case ids:
| A
| B
| C
| D
I would then have the values for the case id be a list.
How do I separate the case ids from the case values?
Example:
let dictionary = items |> List.map (fun item -> item) // uh...
Appendix:
module Checkout
(*Types*)
type UnitPrice = int
type Qty = int
type Special =
    | ThreeForOneThirty
    | TwoForFourtyFive
type TotalPrice = { UnitPrice:int ; Qty:int }
type Item =
    | A of TotalPrice * Special
    | B of TotalPrice * Special
    | C of TotalPrice
    | D of TotalPrice
(*Functions*)
let totalPrice (items:Item list) =
    let dictionary = items |> List.map (fun item -> item) // uh...
    0
(*Tests*)
open FsUnit
open NUnit.Framework
[<Test>]
let ``buying 2 A units, B unit, A unit = $160`` () =
    // Setup
    let items = [A ({UnitPrice=50; Qty=2} , ThreeForOneThirty)
                 B ({UnitPrice=30; Qty=1} , TwoForFourtyFive)
                 A ({UnitPrice=50; Qty=1} , ThreeForOneThirty)]
    items |> totalPrice |> should equal 160
Your data is badly defined for your use case. If you want to refer to the kinds of items by themselves, you need to define them by themselves:
type ItemKind = A | B | C | D
type Item = { Kind: ItemKind; Price: TotalPrice; Special: Special option }
Then you can easily build a dictionary of items:
let dictionary = items |> List.map (fun i -> i.Kind, i) |> dict
Although I must note that such dictionary may not be possible: if the items list contains several items of the same kind, some of them will not be included in the dictionary, because it can't contain multiple identical keys. Perhaps I didn't understand what kind of dictionary you're after.
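If the intent is one entry per kind holding all items of that kind, a sketch along these lines avoids the duplicate-key problem by grouping first. The sample items mirror the question's test data, and itemsByKind is a name chosen here for illustration:

```fsharp
type Special =
    | ThreeForOneThirty
    | TwoForFourtyFive

type TotalPrice = { UnitPrice: int; Qty: int }
type ItemKind = A | B | C | D
type Item = { Kind: ItemKind; Price: TotalPrice; Special: Special option }

let items =
    [ { Kind = A; Price = { UnitPrice = 50; Qty = 2 }; Special = Some ThreeForOneThirty }
      { Kind = B; Price = { UnitPrice = 30; Qty = 1 }; Special = Some TwoForFourtyFive }
      { Kind = A; Price = { UnitPrice = 50; Qty = 1 }; Special = Some ThreeForOneThirty } ]

// Map from kind to every item of that kind; nothing is dropped for sharing a key.
let itemsByKind : Map<ItemKind, Item list> =
    items |> List.groupBy (fun i -> i.Kind) |> Map.ofList
```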
If you want to create the dictionary with keys like A, B, C and D, you will fail, because A and B are constructors of type TotalPrice * Special -> Item, while C and D are constructors of type TotalPrice -> Item. The dictionary would have to settle on a single key type.
Getting the DU constructor name should be doable by reflection, but is it really necessary in your case?
Maybe a different type structure would be more efficient for your case, e.g. Fyodor Soikin's proposal.
Maybe the following will clarify somewhat why the data structure and code are no good, and also clarify that this is mainly not related to FP, as indicated in some of the comments.
My guess is that the question is really "how can this be grouped?", and lo and behold, there is in fact a groupBy function!
(*Types*)
type UnitPrice = int
type Qty = int
type Special =
    | ThreeForOneThirty
    | TwoForFourtyFive
type TotalPrice = { UnitPrice:int ; Qty:int }
type Item =
    | A of TotalPrice * Special
    | B of TotalPrice * Special
    | C of TotalPrice
    | D of TotalPrice
let items = [A ({UnitPrice=50; Qty=2} , ThreeForOneThirty)
             B ({UnitPrice=30; Qty=1} , TwoForFourtyFive)
             A ({UnitPrice=50; Qty=1} , ThreeForOneThirty)]
let speciallyStupidTransformation =
    function
    | ThreeForOneThirty -> 34130
    | TwoForFourtyFive -> 2445
let stupidTransformation =
    function
    | A (t,s) -> "A" + (s |> speciallyStupidTransformation |> string)
    | B (t,s) -> "B" + (s |> speciallyStupidTransformation |> string)
    | C (t) -> "C"
    | D(t) -> "D"
let someGrouping = items |> List.groupBy(stupidTransformation)
val it : (string * Item list) list =
[("A34130",
[A ({UnitPrice = 50;
Qty = 2;},ThreeForOneThirty); A ({UnitPrice = 50;
Qty = 1;},ThreeForOneThirty)]);
("B2445", [B ({UnitPrice = 30;
Qty = 1;},TwoForFourtyFive)])]
Yeah, it's still a bad idea. But it's somewhat grouped uniquely, and may be misused further to aggregate some sums or whatever.
Adding some more code for that, like the following:
let anotherStupidTransformation =
    function
    | A(t,_) -> (t.UnitPrice, t.Qty)
    | B(t,_) -> (t.UnitPrice, t.Qty)
    | C(t) -> (t.UnitPrice, t.Qty)
    | D(t) -> (t.UnitPrice, t.Qty)
let x4y x y tp q =
    if q%x = 0 then y*q/x else tp/q*(q%x)+(q-q%x)/x*y
let ``34130`` = x4y 3 130
let ``2445`` = x4y 2 45
let getRealStupidTotal =
    function
    | (s, (tp,q)) ->
        (s|> List.ofSeq, (tp,q))
        |> function
            | (h::t, (tp,q)) ->
                match t |> List.toArray |> System.String with
                | "34130" -> ``34130`` tp q
                | "2445" -> ``2445`` tp q
                | _ -> tp
let totalPrice =
    items
    |> List.groupBy(stupidTransformation)
    |> List.map(fun (i, l) -> i,
                              l
                              |> List.map(anotherStupidTransformation)
                              |> List.unzip
                              ||> List.fold2(fun acc e1 e2 ->
                                    ((fst acc + e1) * e2, snd acc + e2) ) (0,0))
    |> List.map(getRealStupidTotal)
    |> List.sum
val totalPrice : int = 160
This might or might not get some test cases right.
For the test data above, as far as I can read the initial code, it is at least OK: the sum does come to 160...
Would I use this code anywhere? Nope.
Is it readable? Nope.
Is it fixable? Not without changing the way the data are structured to avoid several of the stupid transformations...

How to write an F# union type chooser?

Is there a better way to do this in F#?
type T =
    | A of int
    | B of string
    static member chooseA x = match x with A i -> Some i | _ -> None
    static member chooseB x = match x with B s -> Some s | _ -> None
The use case is the following:
let collection = [A 10; B "abc"]
let aItems = collection |> Seq.choose T.chooseA
let bItems = collection |> Seq.choose T.chooseB
Thanks!
Use List.partition to split your source elements:
type T =
    | A of int
    | B of string
let collection = [A 10; B "abc"; A 40; B "120"]
let result = List.partition (function | A _ -> true | _ -> false) collection
val result : T list * T list = ([A 10; A 40], [B "abc"; B "120"])
Then you can use fst and snd to select the relevant lists.
This is awkward, but I can see why it is not an important case in F#'s design. Usually, there is a solution that allows for a complete pattern match instead of multiple, somewhat incomplete ones. For example, the two concrete item sequences can be constructed like this:
let aItems, bItems =
    let accA, accB = ResizeArray(), ResizeArray()
    collection |> Seq.iter (function A i -> accA.Add i | B s -> accB.Add s)
    seq accA, seq accB
A similar solution without mutation can be made if you dislike it, but I see little reason to worry about encapsulated mutation. Note that the results are cast to seq.
This uses pattern matching in the manner it is designed for:
If another case is added to T, a warning will appear in the handling function, which is exactly where editing should continue: determining how to treat the new input case.
The program doesn't needlessly iterate the input multiple times for each kind of input, but rather goes over it once and handles each item when first encountered.
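The non-mutating variant mentioned above could look roughly like this. It is a sketch that uses List.foldBack on a list input, so it returns lists rather than seqs:

```fsharp
type T =
    | A of int
    | B of string

let collection = [ A 10; B "abc"; A 40; B "120" ]

// Single pass with a complete match; foldBack keeps the original element order.
let aItems, bItems =
    (collection, ([], []))
    ||> List.foldBack (fun item (accA, accB) ->
        match item with
        | A i -> i :: accA, accB
        | B s -> accA, s :: accB)
```

As with the ResizeArray version, adding a case to T produces an incomplete-match warning at the one place that needs editing.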
If the above isn't suitable, you can still shorten the question's code a bit by using the function keyword and declaring the chooser function as a lambda. For example:
let aItems = collection |> Seq.choose (function A i -> Some i | _ -> None)
Note that this is lazy, just like the proposal in the question: here, every iteration over aItems will needlessly iterate over all the B cases in the input.
I can offer the following variant:
open System.Reflection
type T =
    | A of int
    | B of string
let collection = [A 10; B "abc"; A 40; B "120"]
let sp (col: T list) (str:string) =
    if col=[] then []
    else
        let names = "Is" + str
        col |> List.filter(fun x-> let t = x.GetType()
                                   if t.GetProperty(names) = null then false
                                   else
                                       t.InvokeMember(names, BindingFlags.GetProperty, null, x, null) :?> bool)
            |> List.map(fun y ->
                y.GetType().InvokeMember("get_Item", BindingFlags.InvokeMethod, null, y, null))
sp collection "A" |> printfn "%A\n"
sp collection "B" |> printfn "%A\n"
sp collection "C" |> printfn "%A\n"
Print:
[10; 40]
["abc"; "120"]
[]
http://ideone.com/yAytQk
I'm new to F#, so I suspect this can be done more simply.

Avoiding code duplication in F#

I have two snippets of code that try to convert a float list to a Vector3 or Vector2 list. The idea is to take 2 or 3 elements at a time from the list and combine them into a vector. The end result is a sequence of vectors.
let rec vec3Seq floatList =
    seq {
        match floatList with
        | x::y::z::tail -> yield Vector3(x,y,z)
                           yield! vec3Seq tail
        | [] -> ()
        | _ -> failwith "float array not multiple of 3?"
    }
let rec vec2Seq floatList =
    seq {
        match floatList with
        | x::y::tail -> yield Vector2(x,y)
                        yield! vec2Seq tail
        | [] -> ()
        | _ -> failwith "float array not multiple of 2?"
    }
The code looks very similar and yet there seems to be no way to extract a common portion. Any ideas?
Here's one approach. I'm not sure how much simpler this really is, but it does abstract some of the repeated logic out.
let rec mkSeq (|P|_|) x =
    seq {
        match x with
        | P(p,tail) ->
            yield p
            yield! mkSeq (|P|_|) tail
        | [] -> ()
        | _ -> failwith "List length mismatch" }
let vec3Seq =
    mkSeq (function
        | x::y::z::tail -> Some(Vector3(x,y,z), tail)
        | _ -> None)
As Rex commented, if you want this only for two cases, then you probably won't have any problem if you leave the code as it is. However, if you want to extract a common pattern, then you can write a function that splits a list into sub-list of a specified length (2 or 3 or any other number). Once you do that, you'll only use map to turn each list of the specified length into Vector.
The function for splitting list isn't available in the F# library (as far as I can tell), so you'll have to implement it yourself. It can be done roughly like this:
let divideList n list =
    // 'acc' - accumulates the resulting sub-lists (reversed order)
    // 'tmp' - stores values of the current sub-list (reversed order)
    // 'c' - the length of 'tmp' so far
    // 'list' - the remaining elements to process
    let rec divideListAux acc tmp c list =
        match list with
        | x::xs when c = n - 1 ->
            // we're adding last element to 'tmp',
            // so we reverse it and add it to accumulator
            divideListAux ((List.rev (x::tmp))::acc) [] 0 xs
        | x::xs ->
            // add one more value to 'tmp'
            divideListAux acc (x::tmp) (c+1) xs
        | [] when c = 0 -> List.rev acc // no more elements and empty 'tmp'
        | _ -> failwithf "not multiple of %d" n // non-empty 'tmp'
    divideListAux [] [] 0 list
Now, you can use this function to implement your two conversions like this:
seq { for [x; y] in floatList |> divideList 2 -> Vector2(x,y) }
seq { for [x; y; z] in floatList |> divideList 3 -> Vector3(x,y,z) }
This will give a warning because the pattern is incomplete: it assumes that the returned lists have length 2 or 3 respectively. That assumption holds, so the code works fine. I'm also using the brief form of a sequence expression: -> does the same thing as do yield, but it can be used only in simple cases like this one.
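On newer versions of FSharp.Core (4.0 and later), List.chunkBySize covers the splitting step. Unlike divideList, it does not fail on a length mismatch and simply returns a shorter final chunk, so the sketch below checks the chunk shape itself. It assumes System.Numerics vectors, which take float32 components; the question's Vector2 and Vector3 types may differ:

```fsharp
open System.Numerics

// Assumption: float32 input, because System.Numerics vectors store float32 components.
let vec2Seq (floatList: float32 list) =
    floatList
    |> List.chunkBySize 2
    |> Seq.map (function
        | [ x; y ] -> Vector2(x, y)
        | _ -> failwith "float list not multiple of 2?")

let vec3Seq (floatList: float32 list) =
    floatList
    |> List.chunkBySize 3
    |> Seq.map (function
        | [ x; y; z ] -> Vector3(x, y, z)
        | _ -> failwith "float list not multiple of 3?")
```

Because Seq.map is lazy, a length mismatch only surfaces when the short final chunk is actually enumerated.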
This is similar to kvb's solution but doesn't use a partial active pattern.
let rec listToSeq convert (list:list<_>) =
    seq {
        if not(List.isEmpty list) then
            let list, vec = convert list
            yield vec
            yield! listToSeq convert list
    }
let vec2Seq = listToSeq (function
    | x::y::tail -> tail, Vector2(x,y)
    | _ -> failwith "float array not multiple of 2?")
let vec3Seq = listToSeq (function
    | x::y::z::tail -> tail, Vector3(x,y,z)
    | _ -> failwith "float array not multiple of 3?")
Honestly, what you have is pretty much as good as it can get, although you might be able to make it a little more compact using this:
// take 3 [1 .. 5] returns ([1; 2; 3], [4; 5])
let rec take count l =
    match count, l with
    | 0, xs -> [], xs
    | n, x::xs -> let res, xs' = take (count - 1) xs in x::res, xs'
    | n, [] -> failwith "Index out of range"
// split 3 [1 .. 6] returns [[1;2;3]; [4;5;6]]
let rec split count l =
    seq { match take count l with
          | xs, ys -> yield xs; if ys <> [] then yield! split count ys }
let vec3Seq l = split 3 l |> Seq.map (fun [x;y;z] -> Vector3(x, y, z))
let vec2Seq l = split 2 l |> Seq.map (fun [x;y] -> Vector2(x, y))
Now that the process of breaking up your lists is moved into its own generic "take" and "split" functions, it's much easier to map the chunks to your desired type.
