how to apply a function to each group? - f#

I need to apply a function to each group represented by (<seq>) in the below code, but have the function treat each group as new data. So the function would reset after it has iterated through each group and then output the newly transformed groups.
let datafilter =
    datacsv
    |> List.groupBy (fun row -> row.Class,row.Room)
    |> List.iter (fun ((class,room),rows) -> func rows )
(* Outputs [((112, 1), <seq>); ((113, 2), <seq>); ((114, 3), <seq>);
((115, 4), <seq>); ((116, 5), <seq>); ((117, 6), <seq>);
((118, 7), <seq>); ((119, 8), <seq>); ((120, 9), <seq>);
((121, 10), <seq>); ((122, 11), <seq>); ((123, 12), <seq>)] *)
type RoomAssignment =
    { Class : int
      Room : int
      Date : DateTime
      People : int }
func is an arbitrary function. I get the following error when I run the code:
TermStructure.fsx(96,46): error FS0001: This expression was expected to have type
'unit'
but here has type
'RoomAssignment list'

You did not say what the func function is supposed to be doing, but judging from the error you get, it looks like it produces a list of RoomAssignment values. If you want to collect the results and have one result for each group, you probably need map rather than iter:
let datafilter =
    datacsv
    |> List.groupBy (fun row -> row.Class, row.Room)
    // 'class' is a reserved keyword in F#, so the pattern variable is named cls
    |> List.map (fun ((cls, room), rows) -> func rows)
If you replace func with Seq.length (for example), then this will return the number of rows in each group (as defined by a pair of class and room).
It is difficult to give a more useful answer without having a full reproducible code snippet, but hopefully this will guide you in the right direction.
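As a rough, self-contained sketch of the groupBy-then-map idea: the Row record, the sample data and the totalPeople function below are all made up for illustration (the question does not show what func does). Each group is processed independently, and the group key is kept next to its result.

```fsharp
// Hypothetical record and data, for illustration only.
type Row = { Class: int; Room: int; People: int }

let rows =
    [ { Class = 112; Room = 1; People = 10 }
      { Class = 112; Room = 1; People = 5 }
      { Class = 113; Room = 2; People = 7 } ]

// Stand-in for `func`: total head count within one group.
let totalPeople (group: Row list) =
    group |> List.sumBy (fun r -> r.People)

// List.map returns one result per group; List.iter would require a unit-returning function.
let perGroup =
    rows
    |> List.groupBy (fun r -> r.Class, r.Room)
    |> List.map (fun (key, group) -> key, totalPeople group)

printfn "%A" perGroup
```

Keeping the key in the result is optional; dropping it gives exactly the shape of the snippet above.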

Related

FSharp check if tuple type or not

I am wanting to format a tuple in a specific way and I am trying to do that by checking the type of the tuple (2 element, 3 element etc.). I am getting an error on the third line saying:
This runtime coercion of type test from type
'd
to
'a * ('b * 'c)
involves an indeterminate type based on the information prior to this program point.
Runtime type tests are not allowed on some type. Further type annotations are needed.
Here is my attempt:
let namer x =
    match x with
    | :? ('a * ('b * 'c)) as a, b, c -> sprintf "%s_%s_%s" (a.ToString()) (b.ToString()) (c.ToString())
    | :? ('a * 'b) as a, b -> sprintf "%s_%s" (a.ToString()) (b.ToString())
    | a -> sprintf "%s" (a.ToString())
How should you do something like this? I want to be able to format the string based on the type of tuple.
What I am ultimately wanting is to be able to "flatten" a nested tuple to a string without a bunch of parentheses. For example:
// What I want
let x = (1, (2, (3, 4)))
let name = namer x
printfn "%s" name
> 1_2_3_4
Update: This is different than the question "How can i convert between F# List and F# Tuple?" found here. I know how to do that. What I am wanting is to be able to detect if I have a tuple and what type of tuple. The ideal is a generic function that could take a single element, a tuple, or nested 2 element tuples. For example, legal arguments would be:
let name = namer 1
// or
let name = namer (1, 2)
// or
let name = namer (1, (2, 3))
// or
let name = namer (1, (2, (3, 4)))
I also want to handle non-integer values. For example:
let name = namer (1, ("2", (3, "chicken")))
You can achieve this with a bit of reflection and a pair of mutually recursive functions:
let isTuple tuple =
    tuple.GetType() |> Reflection.FSharpType.IsTuple
let getFields (tuple: obj) =
    tuple |> Reflection.FSharpValue.GetTupleFields |> Array.toList
let rec flatten fields =
    List.collect namer fields
and namer (tuple: obj) =
    if isTuple tuple
    then tuple |> getFields |> flatten
    else [tuple]
namer(1, "test") |> printfn "%A"
namer(1, ("2", (3, "chicken"))) |> printfn "%A"
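The functions above return a flat list of boxed values rather than the underscore-separated string the question asks for. A minimal sketch of that last step, under the same reflection approach, might look like this; the names leaves and nameOf are invented here, and null values are simply treated as leaves.

```fsharp
open Microsoft.FSharp.Reflection

// Recursively collect the leaf values of a (possibly nested) tuple.
let rec leaves (value: obj) : obj list =
    if not (isNull value) && FSharpType.IsTuple(value.GetType()) then
        FSharpValue.GetTupleFields value
        |> Array.toList
        |> List.collect leaves
    else
        [ value ]

// Convert every leaf to a string and join the pieces with underscores.
let nameOf (value: obj) =
    leaves value |> List.map string |> String.concat "_"

printfn "%s" (nameOf (1, (2, (3, 4))))
printfn "%s" (nameOf (1, ("2", (3, "chicken"))))
```

Because everything goes through obj and reflection, this trades compile-time type checking for flexibility.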
Inspired by:
How can i convert between F# List and F# Tuple?
F# flatten nested tuples
How to have two methods calling each other?

Print list of tuple F#

I have some code I am trying to test, that is supposed to merge two int lists of same length into a tuple list. I have got it to compile but I cannot find out if it works as I am having trouble printing the result.
Here is what I have so far:
let myList = [5;15;20;25;30;200]
let myList2 = [6;16;21;26;31;201]
let rec tupleMaker (list1: int list) (list2: int list) =
    match list1, list2 with
    | (h1 :: tail1),(h2 :: tail2)->
        let (a,b) = (h1,h2)
        (a,b) :: tupleMaker tail1 tail2
    | _,_->
        []
let z = tupleMaker myList, myList2
//printfn z
//printfn %A
The printfn does not work and neither has anything else I tried, any help would be greatly appreciated.
You just implemented List.zip:
List.zip myList myList2
//val it : (int * int) list =
// [(5, 6); (15, 16); (20, 21); (25, 26); (30, 31); (200, 201)]
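As for why printing fails with the original function: in tupleMaker myList, myList2 the comma has lower precedence than function application, so z ends up as a tuple of a partially applied function and a list rather than a list of pairs. A sketch of the same function called with space-separated arguments (the match is slightly condensed here, the behaviour should be the same):

```fsharp
let myList = [5; 15; 20; 25; 30; 200]
let myList2 = [6; 16; 21; 26; 31; 201]

let rec tupleMaker (list1: int list) (list2: int list) =
    match list1, list2 with
    | h1 :: tail1, h2 :: tail2 -> (h1, h2) :: tupleMaker tail1 tail2
    | _ -> []

// Arguments are separated by spaces; a comma would build a tuple instead.
let z = tupleMaker myList myList2

// printfn needs a format string; %A prints any F# value structurally.
printfn "%A" z
```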
Note: The OP has not been seen in months, so this answer will probably never get an accept vote. Don't let that make you think it is not a correct answer.
First to generate a list of tuples.
This uses two different types of list to make the answer more useful in general.
let myList: int list = [1;2;3]
let myList2 : string list = ["a";"b";"c"]
let listOfTuple:(int * string) list = List.zip myList myList2
There are many ways to print a list of tuples, but the basic idea is to use List.iter to access the individual tuples in the list and then apply standard means to access the items in the tuple.
Example 1:
This doesn't use List.iter. It uses just printfn %A. This is useful when you are stuck trying to figure out why something will not print and just need to see the data as the system sees it.
printfn "%A" listOfTuple
Result:
[(1, "a"); (2, "b"); (3, "c")]
Example 2:
This uses List.iter with printfn %A. This is useful when you know the data is a list but don't know the type of the individual items.
listOfTuple |>
List.iter (printfn "%A")
Result:
(1, "a")
(2, "b")
(3, "c")
Example 3:
This uses List.iter with a tuple deconstructor, e.g. let (a,b) = values, to get at the individual values of the tuple. This is useful if you want to print every value of every item in the list.
listOfTuple |>
List.iter(
    fun values ->
        let (a,b) = values
        printfn ("%i, %s") a b
    )
Result:
1, a
2, b
3, c
Example 4:
This uses List.iter with a match expression to get at the individual values of the tuple. This is useful if you want to do more complicated processing, such as filtering before printing, or having different printing messages and/or formats for different values.
listOfTuple |>
List.iter(
    fun values ->
        match values with
        | (a,_) when a > 1 ->
            printfn ("%i") a
        | (_,_) -> ()
    )
Result:
2
3

Extract elements from sequences, tuples

Say I have this:
let coor = seq { ... }
// val coor : seq<int * int> = seq[(12,34); (56, 78); (90, 12); ...]
I'm trying to get the value of the first number of the second element in the sequence, in this case 56. Looking at the MSDN Collection API reference, Seq.nth 1 coor returns (56, 78), of type int * int. How do I get 56 out of it?
I suggest you go through Tuple article:
http://msdn.microsoft.com/en-us/library/dd233200.aspx
A couple of examples that might shed some light on the problem:
Function fst is used to access the first element of the tuple:
(1, 2) |> fst // returns 1
Function snd is used to access the second element
(1, 2) |> snd // returns 2
To extract an element from a wider tuple, you can use the following syntax:
let _,_,a,_ = (1, 2, 3, 4) // a = 3
To use it with collections (that is, in the lambdas passed to collection functions), let's start with the following sequence:
let s = seq {
    for i in 1..3 do yield i,-i
}
We end up with
seq<int * int> = seq [(1, -1); (2, -2); (3, -3)]
Let's say we want to extract only the first element (note the arguments of the lambda):
s |> Seq.map (fun (a, b) -> a)
Or even shorter:
s |> Seq.map fst
And let's finally go back to your question:
s |> Seq.nth 1 |> fst
It's a tuple, so you could use the function fst;
> let value = fst(Seq.nth 1 coor);;
val value : int = 56
...or access it via pattern matching;
> let value,_ = Seq.nth 1 coor;;
val value : int = 56
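In more recent versions of FSharp.Core, Seq.nth is marked obsolete in favour of Seq.item, and Seq.tryItem gives an option instead of throwing when the index is out of range. A small sketch, using a made-up three-element coor since the original sequence is elided:

```fsharp
// Sample data for illustration; the question's actual sequence is not shown.
let coor = seq [ (12, 34); (56, 78); (90, 12) ]

// Seq.item is the current name for Seq.nth.
let value = coor |> Seq.item 1 |> fst

// Seq.tryItem returns None instead of throwing if the sequence is too short.
let safeValue = coor |> Seq.tryItem 1 |> Option.map fst
let missing = coor |> Seq.tryItem 10 |> Option.map fst

printfn "%d %A %A" value safeValue missing
```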

Process a stream of Tuples without mutability?

So I want a function that receives an array of Tuple<int,int> and returns the same type but with different values.
What I want to do is a function that returns this kind of values:
f( [1,10; 2,20; 3,40; 4,70] ) = [2,10; 3,20; 4,30]
So as you can see, the first number is basically unchanged (except that the 1st item is not picked), but the second number is the current value minus the previous one (20 - 10 = 10, 40 - 20 = 20, ...).
I've tried to come up with an algorithm in F# that doesn't involve mutability (using an accumulator for the previous value would mean I need a mutable variable), but I can't figure it out. Is this possible?
Using built-in functions. In this case, you can use Seq.pairwise. The function takes a sequence of inputs and produces a sequence of pairs containing the previous value and the current value. Once you have the pairs, you can use Seq.map to transform the pairs into the results - in your case, take the ID of the current value and subtract the previous value from the current value:
input
|> Seq.pairwise
|> Seq.map (fun ((pid, pval), (nid, nval)) -> nid, nval-pval)
Note that the result is a sequence (IEnumerable<T>) rather than a list - simply because the Seq module contains a few more (useful) functions. You could convert it back to list using List.ofSeq.
Using explicit recursion. If your task did not fit one of the common patterns that are covered by some of the built-in functions, then the answer would be to use recursion (which, in general, replaces mutation in the functional style).
For completeness, the recursive version would look like this (this is not perfect, because it is not tail-recursive so it might cause stack overflow, but it demonstrates the idea):
let rec f list =
    match list with
    | (pid, pval)::(((nid, nval)::_) as tail) ->
        (nid, nval-pval)::(f tail)
    | _ -> []
This takes a list and looks at the first two elements of the list (pid, pval) and (nid, nval). Then it calculates the new value based on the two elements in (nid, nval-pval) and then it recursively processes the rest of the list (tail), skipping over the first element. If the list has one or fewer elements (the second case), then nothing is returned.
The tail-recursive version could be written using the "accumulator" trick. Instead of writing newValue::(recursiveCall ...) we accumulate the newly produced values in a list kept as an argument and then reverse it:
let rec f list acc =
    match list with
    | (pid, pval)::(((nid, nval)::_) as tail) ->
        f tail ((nid, nval-pval)::acc)
    | _ -> List.rev acc
Now you just need to call the function using f input [] to initialize the accumulator.
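If exposing the accumulator to callers is undesirable, one common option is to hide the recursive function inside a wrapper. This is only a sketch; the names differences and loop are invented for the example.

```fsharp
// Wrapper that hides the accumulator of the tail-recursive version.
let differences (input: (int * int) list) =
    let rec loop list acc =
        match list with
        | (_, pval) :: (((nid, nval) :: _) as tail) ->
            // Pair the current id with the difference to the previous value.
            loop tail ((nid, nval - pval) :: acc)
        | _ -> List.rev acc
    loop input []

printfn "%A" (differences [ (1, 10); (2, 20); (3, 40); (4, 70) ])
```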
> let values = [(1, 10); (2, 20); (3, 40); (4, 70)];;
val values : (int * int) list = [(1, 10); (2, 20); (3, 40); (4, 70)]
> values
|> Seq.pairwise
|> Seq.map (fun ((x1, y1), (x2, y2)) -> (x2, y2 - y1))
|> Seq.toList;;
val it : (int * int) list = [(2, 10); (3, 20); (4, 30)]
Seq.pairwise gives you each element in a sequence as a pair, except the first element, which is only available as the predecessor of the second element.
For example:
> values |> Seq.pairwise |> Seq.toList;;
val it : ((int * int) * (int * int)) list =
[((1, 10), (2, 20)); ((2, 20), (3, 40)); ((3, 40), (4, 70))]
Second, Seq.map maps each of these pairs of pairs by using the desired algorithm.
Notice that this uses lazy evaluation - I only used Seq.toList at the end to make the output more readable.
BTW, you can alternatively write the map function like this:
Seq.map (fun ((_, y1), (x2, y2)) -> (x2, y2 - y1))
Notice that x1 is replaced with _ because the value isn't used.
Mark and Tomas have given really good solutions for the specific problem. Your question had a statement I think warrants a third answer, though:
(using an accumulator for the previous value would mean I need a mutable variable)
But this is actually not true! List.fold exists exactly to help you process lists with accumulators in a functional way. Here is how it looks:
let f xs = List.fold (fun (y, ys) (d, x) -> x, (d, x-y) :: ys)
                     (snd (List.head xs), [])
                     (List.tail xs)
           |> snd |> List.rev
The accumulator here is the argument (y, ys) to the fun ... in the first line. We can see how the accumulator updates to the right of the ->: we accumulate both the value x of the previous element and the new list we're constructing, (d, x-y)::ys. We'll get that list in reverse order, so we reverse it at the end with List.rev.
Incidentally, List.fold is tail-recursive.
Of course, Tomas and Mark's solutions using Seq.pairwise are much neater for your particular problem, and you'd definitely want to use one of those in practice.
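One caveat with the fold version as written is that List.head and List.tail throw on an empty list. A hedged variant that pattern matches on the input first could look like this (the name diffs is made up for the example):

```fsharp
// Fold-based differences that returns [] for an empty input instead of throwing.
let diffs (xs: (int * int) list) =
    match xs with
    | [] -> []
    | (_, first) :: rest ->
        rest
        // State: (previous value, results accumulated in reverse order).
        |> List.fold (fun (prev, acc) (key, value) -> value, (key, value - prev) :: acc) (first, [])
        |> snd
        |> List.rev

printfn "%A" (diffs [ (1, 10); (2, 20); (3, 40); (4, 70) ])
```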
Whenever we need to create a sequence from another sequence, where one element in the output is a function of its predecessors, scan (docs) comes in handy:
[1,10; 2,20; 3,40; 4,70]
|> List.scan (fun ((accA, accB), prevB) (elA, elB) -> ((elA, elB-prevB), elB)) ((0, 0), 0)
|> Seq.skip 2
|> Seq.map fst
yields:
[(2, 10); (3, 20); (4, 30)]

How to return items in a seq only if they are different than their preceding items in F#?

When I write in python, I always try to think of alternatives as if I was using F#:
I have a seq of tuples (key, value1, value2, ...) I simplify the tuple here so it is only of length 2. Keys contain duplicated figures.
let xs = Seq.zip [|1;2;2;3;3;3;4;1;1;2;2|] {0..10} // so this is a seq of tuple
[(1, 0),
(2, 1),
(2, 2),
(3, 3),
(3, 4),
(3, 5),
(4, 6),
(1, 7),
(1, 8),
(2, 9),
(2, 10)]
Now, I would like to build a function which takes the seq as input and returns a seq that is a subset of the original.
It must capture all items where the key changes, and include the first and last items of the seq if they are not already there.
f1(xs) = [(1, 0), (2, 1), (3, 3), (4, 6), (1, 7), (2, 9), (2, 10)]
f1([]) = []
The following is my python code, it works, but I don't really like it.
xs = zip([1,2,2,3,3,3,4,1,1,2,2], range(11))
def f1(xs):
    if not xs:
        return
    last_a = None # I wish I don't have to use None here.
    is_yield = False
    for a, b in xs:
        if a != last_a:
            last_a = a
            is_yield = True
            yield (a, b)
        else:
            is_yield = False
    if not is_yield:
        yield (a, b) # Ugly, use variable outside the loop.
print list(f1(xs))
print list(f1([]))
Here is another way, using the itertools library:
import itertools
def f1(xs):
    group = None
    for _, group_iter in itertools.groupby(xs, key = lambda pair: pair[0]):
        group = list(group_iter)
        yield group[0]
    # make sure we yield xs[-1], doesn't work if xs is iterator.
    if group and len(group) > 1: # again, ugly, use variable outside the loop.
        yield group[-1]
In F#, Seq.groupBy behaves differently from groupby in Python. I am wondering how this problem can be solved as functionally as possible, with fewer reference cells and less mutable state, and without too much hassle.
A recursive solution that should work, but also isn't particularly beautiful or short, could look something like this (using pattern matching definitely makes it a bit nicer):
let whenKeyChanges input = seq {
    /// Recursively iterate over input, when the input is empty, or we found the last
    /// element, we just return it. Otherwise, we check if the key has changed since
    /// the last produced element (and return it if it has), then process the rest
    let rec loop prevKey input = seq {
        match input with
        | [] -> ()
        | [last] -> yield last
        | (key, value)::tail ->
            if key <> prevKey then yield (key, value)
            yield! loop key tail }
    // Always return the first element if the input is not empty
    match List.ofSeq input with
    | [] -> ()
    | (key, value)::tail ->
        yield (key, value)
        yield! loop key tail }
If you wanted a nicer and a bit more declarative solution, then you could use data frame and time series library that I've been working on at BlueMountain Capital (not yet officially announced, but should work).
// Series needs to have unique keys, so we add an index to your original keys
// (so we have series with (0, 1) => 0; (1, 2) => 1; ...)
let xs = series <| Seq.zip (Seq.zip [0..10] [1;2;2;3;3;3;4;1;1;2;2]) [0..10]
// Create chunks such that your part of the key is the same in each chunk
let chunks = xs |> Series.chunkWhile (fun (_, k1) (_, k2) -> k1 = k2)
// For each chunk, return the first element, or the first and the last
// element, if this is the last chunk (as you always want to include the last element)
chunks
|> Series.map (fun (i, k) chunk ->
    let f = Series.firstValue chunk
    let l = Series.lastValue chunk
    if (i, k) = Series.lastKey chunks then
        if f <> l then [k, f; k, l] else [k, l]
    else [k, f])
// Concatenate the produced values into a single sequence
|> Series.values |> Seq.concat
The chunking is the key operation that you need here (see the documentation). The only tricky thing is returning the last element - which could be handled in multiple different ways - not sure if the one I used is the nicest.
The simplest solution would likely be to convert the sequence to an array and couple John's approach with snatching the first and last elements by index. But, here's another solution to add to the mix:
let f getKey (items: seq<_>) =
    use e = items.GetEnumerator()
    let rec loop doYield prev =
        seq {
            if doYield then yield prev
            if e.MoveNext() then
                yield! loop (getKey e.Current <> getKey prev) e.Current
            elif not doYield then yield prev
        }
    if e.MoveNext() then loop true e.Current
    else Seq.empty
//Usage: f fst xs
I think something like this will work
let remove dup =
    dup
    |> Seq.pairwise
    |> Seq.filter (fun ((a,b),(c,d)) -> a <> c)
    |> Seq.map fst
A correct solution needs to be aware of the end of the sequence, in order to satisfy the special case regarding the last element. Thus there either needs to be two passes such that length is known before processing (e.g. Tomas's solution - first pass is copy to list which unlike seq exposes its "end" as you iterate) or you need to rely on IEnumerable methods so that you know as you iterate when the end has been reached (e.g. Daniel's solution).
Below is inspired by the elegance of John's code, but handles the special cases by obtaining the length up front (2-pass).
let remove dup =
    let last = Seq.length dup - 2
    seq{
        yield Seq.head dup
        yield! dup
               |> Seq.pairwise
               |> Seq.mapi (fun i (a,b) -> if fst a <> fst b || i = last then Some(b) else None)
               |> Seq.choose id
    }
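The two-pass idea can also be sketched with an array, which makes both the empty input and the "always keep the first and last item" rule explicit. This is a different formulation from the code above, not a drop-in replacement; the name keyChanges is invented for the example.

```fsharp
// Two-pass sketch: materialise the input, then keep an item if it is the
// first item, the last item, or its key differs from the preceding key.
let keyChanges (items: seq<'k * 'v>) =
    let arr = Seq.toArray items
    let lastIndex = arr.Length - 1
    arr
    |> Array.mapi (fun i item -> i, item)
    |> Array.filter (fun (i, (key, _)) ->
        i = 0 || i = lastIndex || key <> fst arr.[i - 1])
    |> Array.map snd
    |> Array.toList

let xs = Seq.zip [ 1; 2; 2; 3; 3; 3; 4; 1; 1; 2; 2 ] [ 0 .. 10 ]

printfn "%A" (keyChanges xs)
```

Copying to an array costs extra memory, but it avoids enumerating the sequence more than once.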
Sorry to chime in late here. While the answers so far are very good, I feel that they don't express the fundamental need for mutable state in order to return the last element. While I could rely on IEnumerable methods too, sequence expressions are basically equivalent. We start by defining a three-way DU to encapsulate state.
type HitOrMiss<'T> =
    | Starting
    | Hit of 'T
    | Miss of 'T
let foo2 pred xs = seq{
    let store = ref Starting        // Save last element and state
    for current in xs do            // Iterate sequence
        match !store with           // What had we before?
        | Starting ->               // No element yet
            yield current           // Yield first element
            store := Hit current
        | Hit last                  // Check if predicate is satisfied
        | Miss last when pred last current ->
            yield current           // Then yield intermediate element
            store := Hit current
        | _ ->
            store := Miss current
    match !store with
    | Miss last ->                  // Yield last element, if not already
        yield last
    | _ -> () }
[(1, 0)
 (2, 1)
 (2, 2)
 (3, 3)
 (3, 4)
 (3, 5)
 (4, 6)
 (1, 7)
 (1, 8)
 (2, 9)
 (2, 10)]
|> foo2 (fun (a,_) (b,_) -> a <> b) |> Seq.toList |> printfn "%A"
// [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]

Resources