With leftOuterJoin, .DefaultIfEmpty() is unnecessary - f#

The documentation for leftOuterJoin Query Expressions on MSDN repeatedly implies through the samples that when using leftOuterJoin .. on .. into .. that you must still use .DefaultIfEmpty() to achieve the desired effect.
I don't believe this is necessary because I get the same results in both of these tests which differ only in that the second one does not .DefaultIfEpmty()
type Test = A | B | C
let G = [| A; B; C|]
let H = [| A; C; C|]
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I.DefaultIfEmpty() do
select (g, i)}
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I do
select (g, i)}
// seq [(A, A); (B, null); (C, C); (C, C)]
// seq [(A, A); (B, null); (C, C); (C, C)]
1) Can you confirm this?
If that's right, I realized it only after writing this alternate type augmentation in an attempt to better deal with unmatched results and I was surprised to still see nulls in my output!
type IEnumerable<'TSource> with
member this.NoneIfEmpty = if (Seq.exists (fun _ -> true) this)
then Seq.map (fun e -> Some e) this
else seq [ None ]
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I.NoneIfEmpty do
select (g, i)}
// seq [(A, Some A); (B, Some null); (C, Some C); (C, Some C)]
2) Is there a way to get None instead of null/Some null from the leftOuterJoin?
3) What I really want to do is find out if there are any unmatched g
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I.NoneIfEmpty do
where (i.IsNone)
exists (true) }
I figured this next one out but it isn't very F#:
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I do
where (box i = null)
exists (true)}

Short version: Query Expressions use nulls. It's a rough spot in the language, but a containable one.
I've done this before:
let ToOption (a:'a) =
match obj.ReferenceEquals(a,null) with
| true -> None
| false -> Some(a)
This will let you do:
printfn "%A" <| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I do
select ( g,(ToOption i))}
Which wraps every result in an option (since you don't know if there is going to be an I. It's worth noting that F# uses null to represent None at run-time as an optimization. So to check if this is indeed what you want, make a decision on the option, like:
Seq.iter (fun (g,h) ->
printf "%A," g;
match h with
| Some(h) -> printfn "Some (%A)" h
| None -> printfn "None")
<| query {
for g in G do
leftOuterJoin h in H on (g = h) into I
for i in I do
select ((ToOption g),(ToOption i))}

Related

F# Matching results of recursive calls using higher order functions

Given a simple function, where we do pattern matching on the result of a recursive call, such as:
let rec sumProd = function
| [] -> (0,1)
| x::rest -> let (rSum,rProd) = sumProd rest
(x + rSum,x * rProd)
sumProd [2;5] //Expected (7, 10)
How would I go about changing it into something using higher order functions, e.g. foldBack?
let sumProdHigherOrder lst =
List.foldBack (fun x acc -> (acc + x, acc * x)) lst (0,0)
The above seemed almost like the way to do it, but calling it gives the error: The type 'int' does not match the type 'int * int'
sumProdHigherOrder [2;5] //Expected (7, 10)
What am I missing?
Your missing the tuple functions fst and snd:
List.foldBack (fun x acc -> (fst acc + x, snd acc * x)) [2;5] (0,1)
// val it : int * int = (7, 10)
Or even better, decomposing the tuple at the lambda. I see you just found it:
List.foldBack (fun x (s, m) -> (s + x, m * x)) [2;5] (0,1)
Also note that since the operations are commutative you can do a straight fold:
List.fold (fun (s, m) x -> (s + x, m * x)) (0,1) [2;5]
It will be more efficient.
Right! Of course it shouldn't be the same accumulator that gets passed through the list. After staring intensely at the code for some minutes, I figured it out:
let sumProdHigherOrder lst =
List.foldBack (fun x (acc,acc') -> (acc + x, acc' * x)) lst (0,1)

F# Polynomial Derivator

I'm writing a program that takes a polynomial and returns its derivative. The polynomial is passed as predefined type "poly", which is a list of tuples in which the first element is a float representing a coefficient, and the second is an integer representing the degree of that term. So a poly p = [(2.0, 3);(1.5,2);(3.2;1)] would represent 2x^3 + 1.5x^2 + 3.2x^1. My code is as follows:
let rec diff (p:poly):poly =
match p with
| [] -> raise EmptyList
| [a]-> (fst a * snd a, snd a - 1)
| x::xs -> ((fst x * snd x), (snd x - 1)) :: diff xs
The error I'm getting tells me that the program expects the function to return a type poly, but here has the type 'a * 'b. I don't see why thats the case, when in my base case I return a tuple and in all other situations I'm appending onto an accumulating list. I've played around with the brackets, to no avail. Why is my code tossing this error?
All input is appreciated on the matter.
you said it yourself: in the base case you are returning a tuple not a list - so the inference thinks this is what you want
Just change it into:
let rec diff (p:poly):poly =
match p with
| [] -> raise EmptyList
| [a]-> [fst a * snd a, snd a - 1]
| x::xs -> ((fst x * snd x), (snd x - 1)) :: diff xs
and it should be fine (just replace the (..) with [..] ;) )
remember: :: will cons a new head onto a list
there are a few issues with float vs. int there so I would suggest this (using recursion):
type Poly = (float*int) list
let test : Poly = [(2.0, 3);(1.5,2);(3.2,1);(1.0,0)]
let rec diff (p:Poly):Poly =
match p with
| [] -> []
| x::xs -> (fst x * float (snd x), snd x - 1) :: diff xs
which is really just this:
let diff : Poly -> Poly =
List.map (fun x -> fst x * float (snd x), snd x - 1)
and can look a lot nicer without fst and snd:
let diff : Poly -> Poly =
List.map (fun (a,p) -> a * float p, p - 1)

Squashing tuples from (a,(b,c)) to (a,b,c) in fsharp

Does it make sense to have such functions defined
let squash12 (e:('a*('b*'c) )) = e |> (fun (a,(b,c) ) -> (a,b,c ))
let squash21 (e:(('a*'b)*'c )) = e |> (fun ((a,b),c ) -> (a,b,c ))
let squash13 (e:('a*('b*'c*'d))) = e |> (fun (a,(b,c,d)) -> (a,b,c,d))
let seqsquash12 (sa:seq<'T>) = sa |> Seq.map squash12
let seqsquash21 (sa:seq<'T>) = sa |> Seq.map squash21
let seqsquash13 (sa:seq<'T>) = sa |> Seq.map squash13
I could not find another way to make my core code recursive (leading to nested tuples), yet expose simple function that maps to generalized n-dimensional coordinates.
I would have marked your functions as inline so that they could just be
let inline squash1 (a,(b,c)) = (a,b,c)
Also, you don't need the lambdas (fun ...)
Yes, it makes sense to do so. The suggestion is avoiding lambda to make these functions easier to read:
let squash12 (a, (b, c)) = a, b, c
If you encounter inner tuples with varied arity very often, converting them into lists is not a bad idea. For example, e becomes a tuple of two lists:
(a, (b, c)) ~> ([a], [b; c])
(a, b), c) ~> ([a; b], [c])
(a, (b, c, d)) ~> (a, [b; c; d])
And we only need one function for sequence:
let seqsquash sa = sa |> Seq.map (#)
The problem is you lose the control over size of input. Pattern matching on list could help:
let squash12 (xs, ys) =
match xs, ys with
| [a], [b; c] -> xs, ys
| _ -> failwith "Wrong input size"

F# -> Fold with 2 parameters

I'm trying to make a custom fold which goes through my sequence, and takes 2 Teams a time and assign them to a Match and then return a Match list in the end.
My current code is:
let myFold f s =
let rec myFold' f s acc =
match s with
| (a1::a2::a) -> f a1 a2::acc
| _ -> acc
myFold' f s []
Which gives me (int -> int) list
But obviously thats not going to work... What am I doing wrong?
-> I know that I just can create a recrusive function special made for this scenario, however I want to make it so abstract as possible for reuse.
Im' not quite sure that I get what you want to achieve. From sequence [1;2;3;4] you want to get [(1,2); (3,4)] or [(1,2); (2,3); (3,4)] ?
let fold f s =
let rec impl acc = function
| x::y::rest -> impl ((f x y)::acc) rest
| _ -> List.rev acc
impl [] s
let s = [1;2;3;4;5;6]
let r = fold (fun x y -> x,y) s // [(1, 2); (3, 4); (5, 6)]
let fold2 f s = Seq.pairwise s |> Seq.map f |> Seq.toList
let r2 = fold2 id s // [(1, 2); (2, 3); (3, 4); (4, 5); (5, 6)]

Combine memoization and tail-recursion

Is it possible to combine memoization and tail-recursion somehow? I'm learning F# at the moment and understand both concepts but can't seem to combine them.
Suppose I have the following memoize function (from Real-World Functional Programming):
let memoize f = let cache = new Dictionary<_, _>()
(fun x -> match cache.TryGetValue(x) with
| true, y -> y
| _ -> let v = f(x)
cache.Add(x, v)
v)
and the following factorial function:
let rec factorial(x) = if (x = 0) then 1 else x * factorial(x - 1)
Memoizing factorial isn't too difficult and making it tail-recursive isn't either:
let rec memoizedFactorial =
memoize (fun x -> if (x = 0) then 1 else x * memoizedFactorial(x - 1))
let tailRecursiveFactorial(x) =
let rec factorialUtil(x, res) = if (x = 0)
then res
else let newRes = x * res
factorialUtil(x - 1, newRes)
factorialUtil(x, 1)
But can you combine memoization and tail-recursion? I made some attempts but can't seem to get it working. Or is this simply not possible?
As always, continuations yield an elegant tailcall solution:
open System.Collections.Generic
let cache = Dictionary<_,_>() // TODO move inside
let memoizedTRFactorial =
let rec fac n k = // must make tailcalls to k
match cache.TryGetValue(n) with
| true, r -> k r
| _ ->
if n=0 then
k 1
else
fac (n-1) (fun r1 ->
printfn "multiplying by %d" n //***
let r = r1 * n
cache.Add(n,r)
k r)
fun n -> fac n id
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in cache do
printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
There are two kinds of tests. First, this demos that calling F(4) caches F(4), F(3), F(2), F(1) as you would like.
Then, comment out the *** printf and uncomment the final test (and compile in Release mode) to show that it does not StackOverflow (it uses tailcalls correctly).
Perhaps I'll generalize out 'memoize' and demonstrate it on 'fib' next...
EDIT
Ok, here's the next step, I think, decoupling memoization from factorial:
open System.Collections.Generic
let cache = Dictionary<_,_>() // TODO move inside
let memoize fGuts n =
let rec newFunc n k = // must make tailcalls to k
match cache.TryGetValue(n) with
| true, r -> k r
| _ ->
fGuts n (fun r ->
cache.Add(n,r)
k r) newFunc
newFunc n id
let TRFactorialGuts n k memoGuts =
if n=0 then
k 1
else
memoGuts (n-1) (fun r1 ->
printfn "multiplying by %d" n //***
let r = r1 * n
k r)
let memoizedTRFactorial = memoize TRFactorialGuts
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in cache do
printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
EDIT
Ok, here's a fully generalized version that seems to work.
open System.Collections.Generic
let memoize fGuts =
let cache = Dictionary<_,_>()
let rec newFunc n k = // must make tailcalls to k
match cache.TryGetValue(n) with
| true, r -> k r
| _ ->
fGuts n (fun r ->
cache.Add(n,r)
k r) newFunc
cache, (fun n -> newFunc n id)
let TRFactorialGuts n k memoGuts =
if n=0 then
k 1
else
memoGuts (n-1) (fun r1 ->
printfn "multiplying by %d" n //***
let r = r1 * n
k r)
let facCache,memoizedTRFactorial = memoize TRFactorialGuts
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in facCache do
printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
let TRFibGuts n k memoGuts =
if n=0 || n=1 then
k 1
else
memoGuts (n-1) (fun r1 ->
memoGuts (n-2) (fun r2 ->
printfn "adding %d+%d" r1 r2 //%%%
let r = r1+r2
k r))
let fibCache, memoizedTRFib = memoize TRFibGuts
printfn "---"
let r5 = memoizedTRFib 4
printfn "%d" r5
for KeyValue(k,v) in fibCache do
printfn "%d: %d" k v
printfn "---"
let r6 = memoizedTRFib 5
printfn "%d" r6
printfn "---"
// comment out %%% line, then run this
//let r7 = memoizedTRFib 100000
//printfn "%d" r7
The predicament of memoizing tail-recursive functions is, of course, that when tail-recursive function
let f x =
......
f x1
calls itself, it is not allowed to do anything with a result of the recursive call, including putting it into cache. Tricky; so what can we do?
The critical insight here is that since the recursive function is not allowed to do anything with a result of recursive call, the result for all arguments to recursive calls will be the same! Therefore if recursion call trace is this
f x0 -> f x1 -> f x2 -> f x3 -> ... -> f xN -> res
then for all x in x0,x1,...,xN the result of f x will be the same, namely res. So the last invocation of a recursive function, the non-recursive call, knows the results for all the previous values - it is in a position to cache them. The only thing you need to do is to pass a list of visited values to it. Here is what it might look for factorial:
let cache = Dictionary<_,_>()
let rec fact0 l ((n,res) as arg) =
let commitToCache r =
l |> List.iter (fun a -> cache.Add(a,r))
match cache.TryGetValue(arg) with
| true, cachedResult -> commitToCache cachedResult; cachedResult
| false, _ ->
if n = 1 then
commitToCache res
cache.Add(arg, res)
res
else
fact0 (arg::l) (n-1, n*res)
let fact n = fact0 [] (n,1)
But wait! Look - l parameter of fact0 contains all the arguments to recursive calls to fact0 - just like the stack would in a non-tail-recursive version! That is exactly right. Any non-tail recursive algorithm can be converted to a tail-recursive one by moving the "list of stack frames" from stack to heap and converting the "postprocessing" of recursive call result into a walk over that data structure.
Pragmatic note: The factorial example above illustrates a general technique. It is quite useless as is - for factorial function it is quite enough to cache the top-level fact n result, because calculation of fact n for a particular n only hits a unique series of (n,res) pairs of arguments to fact0 - if (n,1) is not cached yet, then none of the pairs fact0 is going to be called on are.
Note that in this example, when we went from non-tail-recursive factorial to a tail-recursive factorial, we exploited the fact that multiplication is associative and commutative - tail-recursive factorial execute a different set of multiplications than a non-tail-recursive one.
In fact, a general technique exists for going from non-tail-recursive to tail-recursive algorithm, which yields an algorithm equivalent to a tee. This technique is called "continuatuion-passing transformation". Going that route, you can take a non-tail-recursive memoizing factorial and get a tail-recursive memoizing factorial by pretty much a mechanical transformation. See Brian's answer for exposition of this method.
I'm not sure if there's a simpler way to do this, but one approach would be to create a memoizing y-combinator:
let memoY f =
let cache = Dictionary<_,_>()
let rec fn x =
match cache.TryGetValue(x) with
| true,y -> y
| _ -> let v = f fn x
cache.Add(x,v)
v
fn
Then, you can use this combinator in lieu of "let rec", with the first argument representing the function to call recursively:
let tailRecFact =
let factHelper fact (x, res) =
printfn "%i,%i" x res
if x = 0 then res
else fact (x-1, x*res)
let memoized = memoY factHelper
fun x -> memoized (x,1)
EDIT
As Mitya pointed out, memoY doesn't preserve the tail recursive properties of the memoee. Here's a revised combinator which uses exceptions and mutable state to memoize any recursive function without overflowing the stack (even if the original function is not itself tail recursive!):
let memoY f =
let cache = Dictionary<_,_>()
fun x ->
let l = ResizeArray([x])
while l.Count <> 0 do
let v = l.[l.Count - 1]
if cache.ContainsKey(v) then l.RemoveAt(l.Count - 1)
else
try
cache.[v] <- f (fun x ->
if cache.ContainsKey(x) then cache.[x]
else
l.Add(x)
failwith "Need to recurse") v
with _ -> ()
cache.[x]
Unfortunately, the machinery which is inserted into each recursive call is somewhat heavy, so performance on un-memoized inputs requiring deep recursion can be a bit slow. However, compared to some other solutions, this has the benefit that it requires fairly minimal changes to the natural expression of recursive functions:
let fib = memoY (fun fib n ->
printfn "%i" n;
if n <= 1 then n
else (fib (n-1)) + (fib (n-2)))
let _ = fib 5000
EDIT
I'll expand a bit on how this compares to other solutions. This technique takes advantage of the fact that exceptions provide a side channel: a function of type 'a -> 'b doesn't actually need to return a value of type 'b, but can instead exit via an exception. We wouldn't need to use exceptions if the return type explicitly contained an additional value indicating failure. Of course, we could use the 'b option as the return type of the function for this purpose. This would lead to the following memoizing combinator:
let memoO f =
let cache = Dictionary<_,_>()
fun x ->
let l = ResizeArray([x])
while l.Count <> 0 do
let v = l.[l.Count - 1]
if cache.ContainsKey v then l.RemoveAt(l.Count - 1)
else
match f(fun x -> if cache.ContainsKey x then Some(cache.[x]) else l.Add(x); None) v with
| Some(r) -> cache.[v] <- r;
| None -> ()
cache.[x]
Previously, our memoization process looked like:
fun fib n ->
printfn "%i" n;
if n <= 1 then n
else (fib (n-1)) + (fib (n-2))
|> memoY
Now, we need to incorporate the fact that fib should return an int option instead of an int. Given a suitable workflow for option types, this could be written as follows:
fun fib n -> option {
printfn "%i" n
if n <= 1 then return n
else
let! x = fib (n-1)
let! y = fib (n-2)
return x + y
} |> memoO
However, if we're willing to change the return type of the first parameter (from int to int option in this case), we may as well go all the way and just use continuations in the return type instead, as in Brian's solution. Here's a variation on his definitions:
let memoC f =
let cache = Dictionary<_,_>()
let rec fn n k =
match cache.TryGetValue(n) with
| true, r -> k r
| _ ->
f fn n (fun r ->
cache.Add(n,r)
k r)
fun n -> fn n id
And again, if we have a suitable computation expression for building CPS functions, we can define our recursive function like this:
fun fib n -> cps {
printfn "%i" n
if n <= 1 then return n
else
let! x = fib (n-1)
let! y = fib (n-2)
return x + y
} |> memoC
This is exactly the same as what Brian has done, but I find the syntax here is easier to follow. To make this work, all we need are the following two definitions:
type CpsBuilder() =
member this.Return x k = k x
member this.Bind(m,f) k = m (fun a -> f a k)
let cps = CpsBuilder()
I wrote a test to visualize the memoization. Each dot is a recursive call.
......720 // factorial 6
......720 // factorial 6
.....120 // factorial 5
......720 // memoizedFactorial 6
720 // memoizedFactorial 6
120 // memoizedFactorial 5
......720 // tailRecFact 6
720 // tailRecFact 6
.....120 // tailRecFact 5
......720 // tailRecursiveMemoizedFactorial 6
720 // tailRecursiveMemoizedFactorial 6
.....120 // tailRecursiveMemoizedFactorial 5
kvb's solution returns the same results are straight memoization like this function.
let tailRecursiveMemoizedFactorial =
memoize
(fun x ->
let rec factorialUtil x res =
if x = 0 then
res
else
printf "."
let newRes = x * res
factorialUtil (x - 1) newRes
factorialUtil x 1
)
Test source code.
open System.Collections.Generic
let memoize f =
let cache = new Dictionary<_, _>()
(fun x ->
match cache.TryGetValue(x) with
| true, y -> y
| _ ->
let v = f(x)
cache.Add(x, v)
v)
let rec factorial(x) =
if (x = 0) then
1
else
printf "."
x * factorial(x - 1)
let rec memoizedFactorial =
memoize (
fun x ->
if (x = 0) then
1
else
printf "."
x * memoizedFactorial(x - 1))
let memoY f =
let cache = Dictionary<_,_>()
let rec fn x =
match cache.TryGetValue(x) with
| true,y -> y
| _ -> let v = f fn x
cache.Add(x,v)
v
fn
let tailRecFact =
let factHelper fact (x, res) =
if x = 0 then
res
else
printf "."
fact (x-1, x*res)
let memoized = memoY factHelper
fun x -> memoized (x,1)
let tailRecursiveMemoizedFactorial =
memoize
(fun x ->
let rec factorialUtil x res =
if x = 0 then
res
else
printf "."
let newRes = x * res
factorialUtil (x - 1) newRes
factorialUtil x 1
)
factorial 6 |> printfn "%A"
factorial 6 |> printfn "%A"
factorial 5 |> printfn "%A\n"
memoizedFactorial 6 |> printfn "%A"
memoizedFactorial 6 |> printfn "%A"
memoizedFactorial 5 |> printfn "%A\n"
tailRecFact 6 |> printfn "%A"
tailRecFact 6 |> printfn "%A"
tailRecFact 5 |> printfn "%A\n"
tailRecursiveMemoizedFactorial 6 |> printfn "%A"
tailRecursiveMemoizedFactorial 6 |> printfn "%A"
tailRecursiveMemoizedFactorial 5 |> printfn "%A\n"
System.Console.ReadLine() |> ignore
That should work if mutual tail recursion through y are not creating stack frames:
let rec y f x = f (y f) x
let memoize (d:System.Collections.Generic.Dictionary<_,_>) f n =
if d.ContainsKey n then d.[n]
else d.Add(n, f n);d.[n]
let rec factorialucps factorial' n cont =
if n = 0I then cont(1I) else factorial' (n-1I) (fun k -> cont (n*k))
let factorialdpcps =
let d = System.Collections.Generic.Dictionary<_, _>()
fun n -> y (factorialucps >> fun f n -> memoize d f n ) n id
factorialdpcps 15I //1307674368000

Resources