Is this a good use of Seq.cache in F#?

I'm going through a mutable ConcurrentDictionary to remove old entries.
let private cache = ConcurrentDictionary<Instrument * DateTimeOffset, SmallSet>()
and since I can't remove entries while iterating through the keys, I was wondering if this would be a good use for Seq.cache:
let old = DateTimeOffset.UtcNow.AddHours(-1.)
cache.Keys
|> Seq.filter (fun x -> snd x <= old)
|> Seq.cache
|> Seq.iter (fun x -> cache.TryRemove x |> ignore)
I have never used Seq.cache, and I assume it creates a separation between the two loops. Am I understanding how it works correctly?

In the scenario you described, I don't see any reason to iterate over the collection more than once. You can enumerate the KeyValuePairs in the dictionary directly and check whether each one matches your condition.
So, something like this should do the job:
cache |> Seq.iter(function
| x when snd x.Key <= old -> cache.TryRemove(x.Key) |> ignore
| _ -> ())
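A note on the assumption in the question: Seq.cache is lazy. It stores elements as they are enumerated instead of reading the whole source up front, so on its own it does not separate the filtering pass from the removal pass. ConcurrentDictionary enumeration also tolerates removals made while it is in progress, which is why the loop above works. If a real snapshot is wanted, forcing the sequence with Seq.toArray is one way to get it. A minimal sketch, assuming string keys and int values as stand-ins for Instrument and SmallSet:

```fsharp
open System
open System.Collections.Concurrent

// Stand-in types (assumptions): string replaces Instrument, int replaces SmallSet.
let cache = ConcurrentDictionary<string * DateTimeOffset, int>()
let now = DateTimeOffset.UtcNow
cache.[("stale", now.AddHours(-2.))] <- 1
cache.[("fresh", now)] <- 2

let old = now.AddHours(-1.)

// Seq.toArray runs the filter to completion before any removal starts.
let expired =
    cache.Keys
    |> Seq.filter (fun key -> snd key <= old)
    |> Seq.toArray

for key in expired do
    cache.TryRemove(key) |> ignore
```

After this runs, only the "fresh" entry should remain in the dictionary.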


Does Function Composition rely on Partial Application?

Here's my understanding:
Observe the following functions that have some duplication of function calls:
let updateCells (grid:Map<(int * int), Cell>) =
    grid |> Map.toSeq
         |> Seq.map snd
         |> Seq.fold (fun grid c -> grid |> setReaction (c.X, c.Y)) grid
let mutable _cells = ObservableCollection<Cell>( grid |> Map.toSeq
                                                      |> Seq.map snd
                                                      |> Seq.toList )
let cycleHandler _ =
    self.Cells <- ObservableCollection<Cell>( grid |> cycleThroughCells
                                                   |> Map.toSeq
                                                   |> Seq.map snd
                                                   |> Seq.toList )
If you’ve noticed, the following code appears in all three functions:
grid |> Map.toSeq
|> Seq.map snd
Function Composition
Within functional programming, we can fuse functions together so that they can become one function.
To do this, let’s create a new function from the duplicated sequence of functions:
let getCells = Map.toSeq >> Seq.map snd >> Seq.toList
Now if you’re attentive, you will have noticed that we don’t use any arguments when using Function Composition. Hence, the grid value is not used. The reason behind this is because of Partial Application.
Partial Application
I’m still learning all these functional programming techniques. However, my understanding is that partial application is a technique within functional programming that postpones the need to accept a complete set of arguments for a given function. In other words, partial application is the act of deferring the acceptance of a complete set of arguments for a given function in which there is an expectation that the end-client will provide the rest of the arguments later. At least, that’s my understanding.
We can now take a function like:
let updateCells (grid:Map<(int * int), Cell>) =
    grid |> Map.toSeq
         |> Seq.map snd
         |> Seq.fold (fun grid c -> grid |> setReaction (c.X, c.Y)) grid
And refactor it to something like:
let updateCells (grid:Map<(int * int), Cell>) =
    grid |> getCells
         |> Seq.fold (fun grid c -> grid |> setReaction (c.X, c.Y)) grid
Are my thoughts regarding Function Composition being coupled with Partial Application accurate?
Generics
Actually, if you take the expression
let getCells = Map.toSeq >> Seq.map snd >> Seq.toList
and attempt to compile it as a stand-alone expression, you'll get a compiler error:
error FS0030: Value restriction. The value 'getCells' has been inferred to have generic type
val getCells : (Map<'_a,'_b> -> '_b list) when '_a : comparison
Either make the arguments to 'getCells' explicit or, if you do not intend for it to be generic, add a type annotation.
The reason it works in your case is because you're using the getCells function with grid, which means that the compiler infers it to have a constrained type.
In order to keep it generic, you can rephrase it using an explicit argument:
let getCells xs = xs |> Map.toSeq |> Seq.map snd |> Seq.toList
This expression is a valid stand-alone expression of the type Map<'a,'b> -> 'b list when 'a : comparison.
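Both remedies named in the compiler message can be sketched side by side. The names getNames and getValues, the string value type, and the sample maps are assumptions for illustration only:

```fsharp
// Option 1: keep the point-free style and add a type annotation.
// The function is no longer generic, so the value restriction does not apply.
let getNames : Map<int * int, string> -> string list =
    Map.toSeq >> Seq.map snd >> Seq.toList

// Option 2: make the argument explicit; the function stays generic.
let getValues xs = xs |> Map.toSeq |> Seq.map snd |> Seq.toList

let names = getNames (Map.ofList [ ((0, 0), "a"); ((0, 1), "b") ])
let numbers = getValues (Map.ofList [ ("k", 1) ])
```

The annotated version only accepts the one map type, while the explicit-argument version can be reused with any key and value types.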
Point-free
The style used with the >> function composition operator is called point-free. It works well with partial application, but isn't quite the same.
Application
There is, however, an example of partial function application in this example:
let getCells xs = xs |> Map.toSeq |> Seq.map snd |> Seq.toList
The function snd has the following type:
'a * 'b -> 'b
It's a function that takes a single argument.
You could also write the above getCells function without partial application of the snd function:
let getCells xs = xs |> Map.toSeq |> Seq.map (fun x -> snd x) |> Seq.toList
Notice that instead of a partially applied function passed to Seq.map, you can pass a lambda expression. The getCells function is still a function composed from other functions, but it no longer relies on partial application of snd.
Thus, to partially (pun intended) answer your question: function composition doesn't have to rely on partial function application.
Currying
In F#, functions are curried by default. This means that every function takes exactly one argument and returns a value. Sometimes (often), the return value is another function.
Consider, as an example, the Seq.map function. If you call it with one argument, the return value is another function:
Seq.map snd
This expression has the type seq<'a * 'b> -> seq<'b>, because the return value of Seq.map snd is another function.
Eta reduction
This means that you can perform an Eta reduction on the above lambda expression fun x -> snd x, because x appears on both sides of the expression. The result is simply snd, and the entire expression becomes
let getCells xs = xs |> Map.toSeq |> Seq.map snd |> Seq.toList
As you can see, partial application isn't necessary for function composition, but it does make it much easier.
Impartial
The above composition using the pipe operator (|>) still relies on partial application of the functions Map.toSeq, Seq.map, etcetera. In order to demonstrate that composition doesn't rely on partial application, here's an 'impartial' (the opposite of partial? (pun)) alternative:
let getCells xs =
    xs
    |> (fun xs' -> Map.toSeq xs')
    |> (fun xs' -> Seq.map (fun x -> snd x) xs')
    |> (fun xs' -> Seq.toList xs')
Notice that this version makes extensive use of lambda expressions instead of partial application.
I wouldn't compose functions in this way; I only included this alternative to demonstrate that it can be done.
Composition depends on first-class functions, not really on partial applications.
What is required to implement composition is that:
Functions must be able to be taken as arguments and returned as return values
Function signatures must be valid types (if you want the composition to be strongly-typed)
Partial application creates more opportunities for composition, but in principle you can easily define function composition without it.
For example, C# doesn't have partial application*, but you can still compose two functions together, as long as the signatures match:
// Extension methods must be declared inside a static class.
public static Func<a, c> Compose<a, b, c>(this Func<a, b> f,
                                          Func<b, c> g)
{
    return x => g(f(x));
}
which is just >> with an uglier syntax: f.Compose(g).
However, there is one interesting connection between composition and partial application. The definition of the >> operator is:
let (>>) f g x = g(f(x))
and so, when you write foo >> bar, you are indeed partially applying the (>>) function, i.e. omitting the x argument to get the fun x -> g(f(x)) partial result.
But, as I said above, this isn't strictly necessary. The Compose function above is equivalent to F#'s >> operator and doesn't involve any partial application; lambdas perform the same role in a slightly more verbose way.
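The same idea can be sketched in F# itself. The compose function below is a hypothetical helper, not part of the core library; it takes both functions as a single tuple, mirroring the C# signature, so no argument is left out when it is called:

```fsharp
// Hypothetical helper: takes both functions at once and returns a lambda.
let compose (f: 'a -> 'b, g: 'b -> 'c) : 'a -> 'c =
    fun x -> g (f x)

let addOne x = x + 1
let toText (x: int) = string x

// Fully applied call: both arguments are supplied together.
let addOneThenText = compose (addOne, toText)
let viaOperator = addOne >> toText
```

Both addOneThenText and viaOperator should behave identically for any int input.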
* Unless you manually implement it, which nobody does. I.e. instead of writing
string foo(int a, int b)
{
    return (a + b).ToString();
}
you'd have to write
Func<int, string> foo(int a)
{
    return b => (a + b).ToString();
}
and then you'd be able to pass each argument separately just like in F#.

How can I get the intermediate results from each step of a multi-step pipeline function?

I have a code which looks like this:
this.GetItemTypeIdsAsListForOneItemTypeIdTreeUpIncludeItemType itemType.AutoincrementedId
|> Array.map (fun i -> i.AutoincrementedId)
|> Array.map (BusinessLogic.EntityTypes.getFullSetOfEntityTypeFieldValuesForItemTypeAid item.Autoincrementedid)
|> Array.fold Array.append [||]
|> Array.map (fun fv -> { fv with ReferenceAutoId = aid } )
|> Array.toSeq
|> Seq.distinctBy (fun fv -> fv.Fieldname)
|> Seq.toArray
Sometimes this code produces an unusual result that I need to explain. Usually the error is not in the code but in the data, and I need to explain why this set of data is incorrect. What is the best way to do it?
I just want to look at the list on each step of this expression.
Something like:
func data
|> func2 && Console.WriteLine
|> func3 && Console.WriteLine
....
That is: take the input and split it in two, passing one copy to the next function and the other to the console.
For a quick and dirty solution, you can always add a function like this one:
// ('a -> unit) -> 'a -> 'a
let tee f x = f x; x
If, for example, you have a composition like this:
[1..10]
|> List.map string
|> String.concat "|"
you can insert tee in order to achieve a side-effect:
[1..10]
|> List.map string
|> tee (printfn "%A")
|> String.concat "|"
That's not functional, but can be used in a pinch if you just need to look at some intermediate values; e.g. for troubleshooting.
Otherwise, for a 'proper' functional solution, perhaps application of the State monad might be appropriate. That will enable you to carry around state while performing the computation. The state could, for example, contain custom messages collected along the way...
If you just want to 'exit' as soon as you discover that something is wrong, though, then the Either monad is the appropriate way to go.
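As a sketch of the 'exit early' idea, F#'s built-in Result type (available since F# 4.1) can play the role of Either. The validateNonEmpty helper, the run function and the sample data are all hypothetical; a real pipeline would put its own checks between the steps:

```fsharp
// Hypothetical check: fails with a message naming the step that produced no data.
let validateNonEmpty (step: string) (items: int[]) =
    if Array.isEmpty items then Error (sprintf "step '%s' produced no items" step)
    else Ok items

let run (input: int[]) =
    Ok input
    |> Result.map (Array.filter (fun x -> x > 1))
    |> Result.bind (validateNonEmpty "filter")
    |> Result.map (Array.map (fun x -> x * 10))

let good = run [| 1; 2; 3 |]   // every step succeeds
let bad = run [| 0; 1 |]       // the filter leaves nothing, so later steps are skipped
```

Once a step returns Error, Result.map and Result.bind pass the error through untouched, so the message identifies where the data went wrong.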

F#: Generating a word count summary

I am new to programming and F# is my first .NET language.
I would like to read the contents of a text file, count the number of occurrences of each word, and then return the 10 most common words and the number of times each of them appears.
My questions are: Is using a dictionary encouraged in F#? How would I write the code if I wish to use a dictionary? (I have browsed through the Dictionary class on MSDN, but I am still puzzling over how I can update the value to a key.) Do I always have to resort to using Map in functional programming?
While there's nothing wrong with the other answers, I'd like to point out that there's already a specialized function for counting the occurrences of each unique key in a sequence: Seq.countBy. Plumbing the relevant parts of Reed's and torbonde's answers together:
let countWordsTopTen (s : string) =
    s.Split([|','|])
    |> Seq.countBy (fun s -> s.Trim())
    |> Seq.sortBy (snd >> (~-))
    |> Seq.truncate 10
"one, two, one, three, four, one, two, four, five"
|> countWordsTopTen
|> printfn "%A" // seq [("one", 3); ("two", 2); ("four", 2); ("three", 1); ...]
My questions are: Is using a dictionary encouraged in F#?
Using a Dictionary is fine from F#, though it does use mutability, so it's not quite as common.
How would I write the code if I wish to use a dictionary?
If you read the file and have a string of comma-separated values, you could parse it using something similar to:
open System.Collections.Generic
// Just an example of input - this would come from your file...
let strings = "one, two, one, three, four, one, two, four, five"
let words =
    strings.Split([|','|])
    |> Array.map (fun s -> s.Trim())
let dict = Dictionary<_,_>()
words
|> Array.iter (fun w ->
    match dict.TryGetValue w with
    | true, v -> dict.[w] <- v + 1
    | false, _ -> dict.[w] <- 1)
// Creates a sequence of tuples, with (word,count) in order
let topTen =
    dict
    |> Seq.sortBy (fun kvp -> -kvp.Value)
    |> Seq.truncate 10
    |> Seq.map (fun kvp -> kvp.Key, kvp.Value)
I would say an obvious choice for this task is the Seq module, which is really one of the major workhorses in F#. As Reed said, using a dictionary is less common, since it is mutable. Sequences, on the other hand, are immutable. An example of how to do this using sequences is
let strings = "one, two, one, three, four, one, two, four, five"
let words =
    strings.Split([|','|])
    |> Array.map (fun s -> s.Trim())
let topTen =
    words
    |> Seq.groupBy id
    |> Seq.map (fun (w, ws) -> (w, Seq.length ws))
    |> Seq.sortBy (snd >> (~-))
    |> Seq.truncate 10
I think the code speaks pretty much for itself, although maybe the second last line requires a short explanation:
The snd-function gives the second entry in a pair (i.e. snd (a,b) is b), >> is the functional composition operator (i.e. (f >> g) a is the same as g (f a)) and ~- is the unary minus operator. Note here that operators are essentially functions, but when using (and declaring) them as functions, you have to wrap them in parentheses. That is, -3 is the same as (~-) 3, where in the last case we have used the operator as a function.
In total, what the second last line does, is sort the sequence by the negative value of the second entry in the pair (the number of occurrences).
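On F# 4.0 or later, Seq.sortByDescending expresses the same ordering without negating the count. A short sketch that reuses the sample input from the answers above (the name topTen and the final Seq.toList are only illustrative choices):

```fsharp
let topTen (text: string) =
    text.Split([|','|])
    |> Seq.countBy (fun w -> w.Trim())
    |> Seq.sortByDescending snd   // highest count first, no (~-) needed
    |> Seq.truncate 10
    |> Seq.toList

let result = topTen "one, two, one, three, four, one, two, four, five"
```

Seq.truncate, unlike Seq.take, does not fail when the input has fewer than ten distinct words, which is the case for this sample.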

Some basic seq and list questions

Actually I don't care about the input type or the output type, any of seq, array, list will do. (It doesn't have to be generic) Currently my code takes list as input and (list * list) as output
let takeWhile predicator list =
    let rec takeWhileRec newList remain =
        match remain with
        | [] -> (newList |> List.rev, remain)
        | x::xs -> if predicator x then
                       takeWhileRec (x::newList) xs
                   else
                       (newList |> List.rev, remain)
    takeWhileRec [] list
However, there is a pitfall. As far as I can see, List.rev is O(n^2), which would likely dominate the overall running time? I think it is even slower than the ugly solution: Seq.takeWhile, then count, and then take the tail n times... which is still O(n)
(If this were a C# List, I would use that without having to reverse it...)
A side question: what's the difference between Array.ofList and List.toArray, or more generally, A.ofB and B.toA in List, Seq, Array?
Is seq myList identical to List.toSeq myList?
Another side question: does nested Seq.append have the same complexity as Seq.concat?
e.g.
Seq.append (Seq.append (Seq.append a b) c) d // looks awful
Seq.concat [a;b;c;d]
1) The relevant implementation of List.rev is in local.fs in the compiler:
// optimized mutation-based implementation. This code is only valid in fslib, where mutation of private
// tail cons cells is permitted in carefully written library code.
let rec revAcc xs acc =
    match xs with
    | [] -> acc
    | h::t -> revAcc t (h::acc)
let rev xs =
    match xs with
    | [] -> xs
    | [_] -> xs
    | h1::h2::t -> revAcc t [h2;h1]
The comment does seem odd, as there is no obvious mutation. Note that this is in fact O(n), not O(n^2).
2) As pad said, there is no difference. I prefer the to... functions, as I think
A
|> List.map ...
|> List.toArray
looks nicer than
A
|> List.map ...
|> Array.ofList
but that is just me.
3)
Append (compiler source):
[<CompiledName("Append")>]
let append (source1: seq<'T>) (source2: seq<'T>) =
    checkNonNull "source1" source1
    checkNonNull "source2" source2
    fromGenerator(fun () -> Generator.bindG (toGenerator source1) (fun () -> toGenerator source2))
Note that each append adds an extra generator that has to be walked through. In comparison, the concat implementation adds just one extra function rather than n, so using concat is probably better.
To answer your questions:
1) Time complexity of List.rev is O(n) and worst-case complexity of takeWhile is also O(n). So using List.rev doesn't increase complexity of the function. Using ResizeArray could help you avoid List.rev, but you have to tolerate a bit of mutation.
let takeWhile predicate list =
    let rec loop (acc: ResizeArray<_>) rest =
        match rest with
        | x::xs when predicate x -> acc.Add(x); loop acc xs
        | _ -> (acc |> Seq.toList, rest)
    loop (ResizeArray()) list
2) There is no difference. Array.ofList and List.toArray use the same function internally.
3) I think Seq.concat has the same complexity as a chain of Seq.append calls. For List and Array, concat is more efficient than append because more information is available to pre-allocate space for the output.
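Whatever the performance difference, the two forms from the question produce the same elements in the same order. A small check, with made-up sample lists standing in for a, b, c and d:

```fsharp
// Sample inputs (assumptions); any sequences of the same element type would do.
let a, b, c, d = [1], [2; 3], [4], [5; 6]

let nested = Seq.append (Seq.append (Seq.append a b) c) d
let flat = Seq.concat [a; b; c; d]

// Both are lazy; Seq.toList forces them so they can be compared.
let same = Seq.toList nested = Seq.toList flat
```

This says nothing about speed; it only confirms that Seq.concat is a drop-in replacement for the nested Seq.append calls.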
How about this:
let takeWhile pred =
    let cont = ref true
    List.partition (pred >> fun r -> !cont && (cont := r; r))
It uses a single library function, List.partition, which is efficiently implemented.
Hope this is what you meant :)

Is there a more generic way of iterating, filtering, and applying operations on collections in F#?

Let's take this code:
open System
open System.IO
let lines = seq {
    use sr = new StreamReader(@"d:\a.h")
    while not sr.EndOfStream do yield sr.ReadLine()
}
lines |> Seq.iter Console.WriteLine
Console.ReadLine() |> ignore
Here I am reading all the lines in a seq, and to go over it, I am using Seq.iter. If I have a list I would be using List.iter, and if I have an array I would be using Array.iter. Isn't there a more generic traversal function I could use, instead of having to keep track of what kind of collection I have? For example, in Scala, I would just call a foreach and it would work regardless of the fact that I am using a List, an Array, or a Seq.
Am I doing it wrong?
You may or may not need to keep track of what type of collection you deal with, depending on your situation.
For simple iteration, nothing prevents you from using Seq.iter on lists or arrays in F#: it works on both, because both are also sequences, or IEnumerables from the .NET standpoint. Using Array.iter on an array, or List.iter on a list, simply offers a more efficient traversal based on the specific properties of each collection type. As the signature Seq.iter : ('T -> unit) -> seq<'T> -> unit shows, the traversal returns unit, so the collection type does not matter afterwards.
In other situations, if you care about further composition, you may want to consider the types of the input and output arguments and use the specialized functions. For example, if you need to filter a list and continue using the result, then List.filter : ('T -> bool) -> 'T list -> 'T list preserves the type of the underlying collection, whereas Seq.filter : ('T -> bool) -> seq<'T> -> seq<'T> applied to a list returns a sequence, not a list:
let alist = [1;2;3;4] |> List.filter (fun x -> x%2 = 0) // alist is still a list
let aseq = [1;2;3;4] |> Seq.filter (fun x -> x%2 = 0) // aseq is not a list anymore
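To make both points concrete, here is a minimal sketch. The helper sumOfEvens is hypothetical; it is annotated with seq<int> so that lists, arrays and sequence expressions can all be passed to it, and the last line shows one way to get a list back after using a Seq function:

```fsharp
// Hypothetical helper: accepts any seq<int>, so lists and arrays can be passed directly.
let sumOfEvens (xs: seq<int>) =
    xs |> Seq.filter (fun x -> x % 2 = 0) |> Seq.sum

let fromList = sumOfEvens [1; 2; 3; 4]
let fromArray = sumOfEvens [|1; 2; 3; 4|]
let fromSeq = sumOfEvens (seq { 1 .. 4 })

// Converting back when a list is needed after a Seq function.
let evens = [1; 2; 3; 4] |> Seq.filter (fun x -> x % 2 = 0) |> List.ofSeq
```

The conversion with List.ofSeq builds a new list, so when the result must stay a list it is usually simpler to call List.filter in the first place.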
Seq.iter works on lists and arrays just as well.
The type seq is actually an alias for the interface IEnumerable<'T>, which list and array both implement. So, as BLUEPIXY indicated, you can use Seq.* functions on arrays or lists.
A less functional-looking way would be the following:
for x in [1..10] do
    printfn "%A" x
Lists and arrays are treated as seq.
let printAll seqData =
    seqData |> Seq.iter (printfn "%A")
    Console.ReadLine() |> ignore
printAll lines
printAll [1..10]
printAll [|1..10|]
